Fix AddressGroup memberKey not updated on pod IP update #1808
Codecov Report
@@ Coverage Diff @@
## main #1808 +/- ##
=======================================
Coverage ? 53.36%
=======================================
Files ? 193
Lines ? 16435
Branches ? 0
=======================================
Hits ? 8770
Misses ? 6572
Partials ? 1093
Flags with carried forward coverage won't be shown.
LGTM, thanks for the quick fix.
/test-all
LGTM. I forgot to mention that we could have an e2e case for this scenario. I think it can be reproduced by killing the sandbox container of a Pod.
Sounds like a unit test that ensures the controller sends an update when the IP address of the Pod changes would be sufficient. Do we have such a test in the PR already?
Thanks for the suggestion. I just added such a test in addressgroup_test.go and confirmed that it fails without this patch and passes with it.
/test-all
LGTM, thanks for the quick fix and for adding the test
LGTM
* Fix identical GroupMember key across pod IP update
* Add UT and improve agent AddressGroup update logic
This PR fixes #1807.
PR #1467 unified GroupMemberPod and GroupMember into a single struct. Before the unification, PodReference was not set for GroupMemberPods used in AddressGroups, but was set for GroupMemberPods used in AppliedToGroups. In #1467, in order to identify whether a GroupMember contains a Pod or an ExternalEntity (EE) when converting it to GroupMemberPod (for backward compatibility), PodRef was instead always added to GroupMembers.
This leads to two separate issues. The first issue is #1587, where agents running v1beta1 do not expect PodReference in AddressGroup members. This was fixed by @tnqn in #1586 by not including PodRef in the AddressGroup conversion function.
The second issue is #1807, where AddressGroup updates do not propagate to agents when Pod IPs change after a Node restart. This is because if PodRef is always set for GroupMembers, the memberKey is always the namespaced name, and the key does not change on Pod IP updates.
To fix the second issue, this PR changes how the normalizeGroupMember function is implemented. The new hashed key of a GroupMember includes every field of the GroupMember that is set. Namely, GroupMembers used in AppliedToGroups have the Pod/ExternalEntityReference set, and these members are identified by namespaced name. GroupMembers used in AddressGroups have BOTH the Pod/ExternalEntityReference and IPs set, and these members are identified by entity reference + IPs.
For this change to be backward compatible, we need to ensure that across an Antrea controller upgrade, all AddressGroups get an updated name. Otherwise, on the old side, AddressGroup update events will not work as expected, since the GroupMember key for the same GroupMembers will be different. This would lead to the same issue pointed out in #1587, where the same AddressGroup member IPs get deleted after being added.
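The key scheme described above can be illustrated with a minimal Go sketch. This is not the actual normalizeGroupMember code: the PodRef and GroupMember types, the memberKey helper, and the choice of SHA-256 are all assumptions made for illustration. It only shows the idea that every field that is set contributes to the key, so a Pod IP change produces a different key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// PodRef and GroupMember are simplified, hypothetical stand-ins for the
// Antrea controlplane types; only fields relevant to key generation appear.
type PodRef struct {
	Namespace, Name string
}

type GroupMember struct {
	Pod *PodRef  // set for both AppliedToGroup and AddressGroup members
	IPs []string // set only for AddressGroup members
}

// memberKey (hypothetical helper) hashes every field that is set.
// IPs are sorted on a copy so that ordering does not affect the key.
func memberKey(m GroupMember) string {
	var b strings.Builder
	if m.Pod != nil {
		b.WriteString("pod:" + m.Pod.Namespace + "/" + m.Pod.Name + ";")
	}
	if len(m.IPs) > 0 {
		ips := append([]string(nil), m.IPs...)
		sort.Strings(ips)
		b.WriteString("ips:" + strings.Join(ips, ",") + ";")
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	ref := &PodRef{Namespace: "default", Name: "web-0"}
	appliedTo := GroupMember{Pod: ref}
	before := GroupMember{Pod: ref, IPs: []string{"10.0.0.5"}}
	after := GroupMember{Pod: ref, IPs: []string{"10.0.0.9"}}

	// Same Pod, different IP: the keys differ, so the change is visible.
	fmt.Println(memberKey(before) != memberKey(after))
	// The AppliedToGroup member (no IPs) has its own key as well.
	fmt.Println(memberKey(appliedTo) != memberKey(before))
}
```

With a key derived only from the namespaced name, `before` and `after` would collide, which is the behavior reported in #1807.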
Hence, even if two AddressGroups have the same members (with different keys) before and after the controller upgrade, the Antrea agent needs to treat this as AddressGroup add and delete events rather than an update. To that end, the controller uses a new uuidNamespace to ensure that, for AddressGroups/AppliedToGroups with the same members, distinct group names are generated across the upgrade.
In addition, this PR fixes a potential issue in the Antrea agent in how the reconciler calculates AddressGroup updates (this was the issue that led to #1587, but it wasn't addressed then because, in upgrade scenarios, agent code cannot be patched). When calculating the added and deleted addresses in an AddressGroup, only IPs should be taken into account. Performing a diff on GroupMembers and then taking the IPs of the diff is error-prone when the agent and controller have different hash functions for GroupMembers.
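The IP-only diff described above can be sketched as follows. This is an illustrative sketch, not the agent's reconciler code: the GroupMember type with an opaque Key field, and the ipSet and diffIPs helpers, are hypothetical names. The point is that member keys are ignored entirely, so a member whose key changed but whose IP did not produces no add or delete.

```go
package main

import (
	"fmt"
	"sort"
)

// GroupMember is a hypothetical, simplified member: an opaque key
// (as computed by whichever hash function the sender used) plus its IPs.
type GroupMember struct {
	Key string
	IPs []string
}

// ipSet flattens members into a set of IPs, ignoring member keys.
func ipSet(members []GroupMember) map[string]struct{} {
	set := make(map[string]struct{})
	for _, m := range members {
		for _, ip := range m.IPs {
			set[ip] = struct{}{}
		}
	}
	return set
}

// diffIPs returns the IPs to add and to delete, each sorted,
// comparing only the IP sets of the old and new member lists.
func diffIPs(oldMembers, newMembers []GroupMember) (added, deleted []string) {
	oldIPs, newIPs := ipSet(oldMembers), ipSet(newMembers)
	for ip := range newIPs {
		if _, ok := oldIPs[ip]; !ok {
			added = append(added, ip)
		}
	}
	for ip := range oldIPs {
		if _, ok := newIPs[ip]; !ok {
			deleted = append(deleted, ip)
		}
	}
	sort.Strings(added)
	sort.Strings(deleted)
	return added, deleted
}

func main() {
	// 10.0.0.5 appears under a different key before and after,
	// as it would if the two sides hashed members differently.
	prev := []GroupMember{
		{Key: "old-hash-a", IPs: []string{"10.0.0.5"}},
		{Key: "old-hash-b", IPs: []string{"10.0.0.6"}},
	}
	next := []GroupMember{
		{Key: "new-hash-a", IPs: []string{"10.0.0.5"}},
		{Key: "new-hash-c", IPs: []string{"10.0.0.7"}},
	}
	added, deleted := diffIPs(prev, next)
	fmt.Println(added, deleted) // [10.0.0.7] [10.0.0.6]
}
```

A diff keyed on members would report 10.0.0.5 as both added and deleted here, and depending on the order in which those changes are applied, the address could end up removed.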