-
-
Notifications
You must be signed in to change notification settings - Fork 749
vmac: fix creation / deletion VMAC issue due to race conditions #2388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
d8d79f3
to
4850a6b
Compare
Due to a race condition, the VMAC interface sometimes remains undeleted after Netlink reports the removal of its link interface. The function cleanup_lost_interface() fails to delete the interface. This failure occurs because the condition 'IF_BASE_IFP(vrrp->ifp) != ifp' in the following code segment never matches: > if (vrrp->ifp != ifp > #ifdef _HAVE_VRRP_VMAC_ > && IF_BASE_IFP(vrrp->ifp) != ifp && VRRP_CONFIGURED_IFP(vrrp) != ifp > #endif This non-matching situation arises because 'vrrp->ifp->base_ifp' is the same as 'vrrp->ifp'. During the initialization in if_get_by_ifname(), when a new interface_t pointer is created, 'ifp->base_ifp' is set as follows: > ifp->base_ifp = ifp; netlink_if_link_populate() gets bypassed whenever 'ifp->base_ifp->ifindex' is different from the index of the actual link interface. This bypass is inevitable when 'ifp' is newly allocated. netlink_if_link_populate() normally populates the 'ifp' data including the 'ifp->base_ifp'. Fix the issue of VMAC deletion by adjusting the logic so that netlink_if_link_populate() does not get skipped when the base_ifp of a VMAC is the VMAC itself. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add #endif descriptions in netlink_if_link_populate() to clarify in which #ifdef blocks the next commit is modifying the code. This is a cosmetic change. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The VMAC interface sometimes does not reappear when its associated link interface is quickly re-added after being deleted, a situation caused by a race condition. This problem manifests during the operations of cleanup_lost_interface() This function checks for VMAC interfaces on top of the removed link interface. It deletes these VMAC interfaces if they are present. Subsequently, Netlink is invoked to refresh the information of all interfaces, and this is followed by the cleaning of the link interface data in memory. The problem occurs when Netlink, queried indirectly by sub-functions of cleanup_lost_interface(), detects that the link interface has reappeared. Although the interface data in memory is updated accordingly, cleanup_lost_interface() unconditionally clears this refreshed information. As a result, the data regarding the link interface is lost, preventing the re-creation of its associated VMAC interfaces. Fix the VMAC creation issue by adding a 'cleaning flag' that is set at the start of the cleanup process. This flag says whether to proceed with the interface data cleanup. If the interface is refreshed during the Netlink polling, the flag is unset, thereby preventing the subsequent clearing of the updated interface information. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
4850a6b
to
0624ca4
Compare
@pqarmitage Please review this |
@louis-6wind I was looking at these commits yesterday and trying to work out what was happening. I think commit 984ac06 may have improved the situation, since it now doesn't delete any VMACs until it has worked through all the VRRP instances tracking a deleted interface, and only deletes the VMACs once the internal data structures have all been updated. I would be grateful for your thoughts as to whether commit 984ac06 affects any of your changes. I am currently trying to work my way through all the outstanding pull requests, either to merge them, review them, or explain why they cannot be merged. I note that you have been collaborating with Hari Sasank (@harisasank) and I am confused by pull requests #2565 and #2556, which appear to be addressing the same issue, although I note that #2565 has been closed now. The original commit (1960568) appears to have disappeared, and we now have three new commits (I haven't had time to look at these yet, since I was trying to work out what was happening between #2556 and #2565). |
Fix two race conditions issue where the VMAC interfaces are not created or deleted following the apparition or disparition of their link interface.