-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix missing routes caused by Node events coming out of order #1526
Conversation
Thanks for your PR. The following commands are available:
|
Codecov Report
@@ Coverage Diff @@
## master #1526 +/- ##
==========================================
+ Coverage 67.54% 67.59% +0.05%
==========================================
Files 169 169
Lines 13424 13471 +47
==========================================
+ Hits 9067 9106 +39
+ Misses 3416 3415 -1
- Partials 941 950 +9
Flags with carried forward coverage won't be shown. Click here to find out more.
|
/test-all |
// event to be processed before proceeding, or the route installation and uninstallation operations may override or | ||
// conflict with each other. | ||
if len(nodesHaveSamePodCIDR) > 0 { | ||
return fmt.Errorf("podCIDR %s for Node %s is duplicate with Node %s", node.Spec.PodCIDR, nodeName, nodesHaveSamePodCIDR[0].(*nodeRouteInfo).nodeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we mention that this error could be temporary? or something like addNoteRoute is deferred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the suggestion, done
// event to be processed before proceeding, or the route installation and uninstallation operations may override or | ||
// conflict with each other. | ||
if len(nodesHaveSamePodCIDR) > 0 { | ||
return fmt.Errorf("podCIDR %s for Node %s is duplicate with Node %s", node.Spec.PodCIDR, nodeName, nodesHaveSamePodCIDR[0].(*nodeRouteInfo).nodeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, also agree with Yang's suggestion
LGTM |
PodCIDRs can be released from deleted Nodes and allocated to new Nodes. For server side, it won't happen that a PodCIDR is allocated to more than one Node at any point. However, for client side, if a resync happens to occur when there are Node creation and deletion events, the informer will generate the events in a way that all creation events come before deletion ones even they actually happen in the opposite order on the server side. Therefore, a PodCIDR may appear in a new Node before the Node that previously owns it is removed. To ensure the stale routes, flows, and relevant cache of this podCIDR are removed appropriately, we wait for the Node deletion event to be processed before proceeding, or the route installation and uninstallation operations may override or conflict with each other.
915c4c8
to
f93cf30
Compare
/test-all |
/test-windows-conformance |
"TestHostPortPodConnectivity" in "Kind / E2e tests on a Kind cluster on Linux with Antrea NetworkPolicies enabled" failed because of "ErrImagePull", since the test has passed in other verifications, I will skip rerunning all kind tests.
|
…io#1526) PodCIDRs can be released from deleted Nodes and allocated to new Nodes. For server side, it won't happen that a PodCIDR is allocated to more than one Node at any point. However, for client side, if a resync happens to occur when there are Node creation and deletion events, the informer will generate the events in a way that all creation events come before deletion ones even they actually happen in the opposite order on the server side. Therefore, a PodCIDR may appear in a new Node before the Node that previously owns it is removed. To ensure the stale routes, flows, and relevant cache of this podCIDR are removed appropriately, we wait for the Node deletion event to be processed before proceeding, or the route installation and uninstallation operations may override or conflict with each other.
PodCIDRs can be released from deleted Nodes and allocated to new Nodes. For server side, it won't happen that a PodCIDR is allocated to more than one Node at any point. However, for client side, if a resync happens to occur when there are Node creation and deletion events, the informer will generate the events in a way that all creation events come before deletion ones even they actually happen in the opposite order on the server side. Therefore, a PodCIDR may appear in a new Node before the Node that previously owns it is removed. To ensure the stale routes, flows, and relevant cache of this podCIDR are removed appropriately, we wait for the Node deletion event to be processed before proceeding, or the route installation and uninstallation operations may override or conflict with each other.
PodCIDRs can be released from deleted Nodes and allocated to new Nodes.
For server side, it won't happen that a PodCIDR is allocated to more
than one Node at any point. However, for client side, if a resync
happens to occur when there are Node creation and deletion events, the
informer will generate the events in a way that all creation events come
before deletion ones even they actually happen in the opposite order on
the server side. Therefore, a PodCIDR may appear in a new Node before
the Node that previously owns it is removed.
To ensure the stale routes, flows, and relevant cache of this podCIDR
are removed appropriately, we wait for the Node deletion event to be
processed before proceeding, or the route installation and
uninstallation operations may override or conflict with each other.
Fixes #1527