Remove the restriction that an ACNP referred ClusterGroup must be created before the ACNP #2478
Codecov Report
@@ Coverage Diff @@
## main #2478 +/- ##
==========================================
- Coverage 60.58% 59.34% -1.24%
==========================================
Files 286 287 +1
Lines 23096 23107 +11
==========================================
- Hits 13993 13714 -279
- Misses 7644 7956 +312
+ Partials 1459 1437 -22
9320e68 to c1733da
c1733da to f1de0be
f1de0be to fd5faea
if err != nil {
	klog.Errorf("Unable to delete internal Group %s from store: %v", key, err)
}
n.triggerParentGroupSync(grp)
We have been triggering CNP updates from the workers. Is it possible to do so in syncInternalGroup?
It would be very tricky to do so in syncInternalGroup in the deletion case. When syncInternalGroup processes a group key, the controller expects the internal group to still exist in the internal group store; otherwise it does nothing. So if we want to rely on syncInternalGroup for parent group syncs, we would need to delete the internal group from the internal store after syncInternalGroup runs. However, by the time the parent group is synced, we also want the child group to be deleted from the internal store already, so that we don't include outdated members. Since this process is async, it is hard to ensure that the parent's syncInternalGroup happens after the internal group deletion.
Hmm, I'm worried that this sits in the path of the synchronous Delete handlers and may cause a perf issue. Maybe add a TODO comment here to explain why it is done here for now, and note that we should look into moving this to the workers. Let me also dig deeper to see if there is an alternative.
Do you plan to add a TODO here? My comment is more about triggerCNPUpdates than the parent group sync, as that's the one I presume would be time-consuming.
Thanks for the reminder. Added TODO.
fd5faea to 6106fad
6106fad to 021c563
/test-all
021c563 to 40a106f
/test-all
40a106f to b83346c
rebased /test-all
b83346c to 9fea7ce
/test-all
Yup, I will take care of this PR. Let us first merge the static ipblock PR #2577, and then I will resolve all conflicts together in one rebase.
…ated before the ACNP as well as the restriction on ClusterGroup cannot be deleted if still referred to by an ACNP. Signed-off-by: Yang Ding <dingyang@vmware.com>
Co-authored-by: abhiraut <rauta@vmware.com> Signed-off-by: abhiraut <rauta@vmware.com>
9fea7ce to 504f6b5
/test-all
// members. Since this process is async, there's no guarantee which happens first. The drawback is
// that all ACNPs that refers to this CG directly will be synchronously processed once as part of
// deletion handler, which could cause perf issue in scale.
c.triggerCNPUpdates(og)
The call is unnecessary.
> If we want to rely on syncInternalGroup for ACNP syncs, the internal group needs to be deleted from the internal store after syncInternalGroup.
I don't think we need to defer deleting the internal group to syncInternalGroup. triggerCNPUpdates actually requires only the group name. The association is stored in the CNP store, not the group store.
> However, at the time of ACNP sync, we also want to ensure that the group is deleted from internal store already, so that we don't include outdated members. Since this process is async, there's no guarantee which happens first.
It doesn't matter whether the group has already been deleted from the internal store when the ACNP is processed. The only thing we need to ensure is that the CNP update is triggered after syncInternalGroup, so that its data will eventually be correct.
Of course, we need to ensure that c.triggerParentGroupSync() and c.triggerCNPUpdates() are called in syncInternalGroup regardless of whether the internal group exists. Deferred calls can work. I don't think they really need to return an error.
func (c *NetworkPolicyController) syncInternalGroup(key string) error {
	defer c.triggerCNPUpdates(key)
	defer c.triggerParentGroupSync(key)
	// Retrieve the internal Group corresponding to this key.
	...
}
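The deferred-call suggestion above can be tried out in isolation. The following is a minimal sketch, not the actual Antrea controller: the `controller` type, its `groups` map, and the `triggered` slice are hypothetical stand-ins, and returning `nil` for a missing group is an assumption. It only shows that both triggers fire even when the group has already been removed from the store.

```go
package main

import "fmt"

// controller is a hypothetical, stripped-down stand-in for the real
// NetworkPolicyController. It only records which follow-up syncs were requested.
type controller struct {
	groups    map[string]struct{} // stand-in for the internal group store, keyed by name
	triggered []string            // follow-up work requested, in call order
}

func (c *controller) triggerCNPUpdates(key string) {
	c.triggered = append(c.triggered, "cnp:"+key)
}

func (c *controller) triggerParentGroupSync(key string) {
	c.triggered = append(c.triggered, "parent:"+key)
}

// syncInternalGroup defers both triggers so that they run on every return
// path, including the early return taken when the group no longer exists.
// Deferred calls run in last-in, first-out order.
func (c *controller) syncInternalGroup(key string) error {
	defer c.triggerCNPUpdates(key)
	defer c.triggerParentGroupSync(key)
	if _, ok := c.groups[key]; !ok {
		// Assumption: a missing group means there is nothing to recompute.
		return nil
	}
	// ... recompute group members here ...
	return nil
}

func main() {
	// The group was deleted from the store before the worker picked up the key.
	c := &controller{groups: map[string]struct{}{}}
	err := c.syncInternalGroup("cg-a")
	fmt.Println(err, c.triggered)
}
```

Because the triggers take only the key, the delete handler in this sketch would be free to remove the group from the store and enqueue the key, leaving the follow-up work to the worker.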
if err != nil {
	klog.Errorf("Unable to delete internal Group %s from store: %v", key, err)
}
c.triggerParentGroupSync(grp)
Same as triggerCNPUpdates. Calling it in syncInternalGroup should be good enough. The method doesn't really require the whole group but just the group name.
Thanks for the review, Quan! Let me evaluate and push a change based on that.
Signed-off-by: abhiraut <rauta@vmware.com>
LGTM, just one nit
parentGroupObjs, err := c.internalGroupStore.GetByIndex(store.ChildGroupIndex, grp)
if err != nil {
	klog.Errorf("Error retrieving parents of ClusterGroup %s: %v", grp, err)
}
nit: Not introduced by this PR, but it should return early on error, just like triggerCNPUpdates, even though the processing that follows is safe to run in this particular case.
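A rough sketch of the early-return shape this nit asks for, assuming a hypothetical `parentIndex` function type in place of the real `GetByIndex(store.ChildGroupIndex, grp)` call, and `fmt.Printf` in place of `klog.Errorf`:

```go
package main

import (
	"errors"
	"fmt"
)

// parentIndex is a hypothetical stand-in for an indexed store lookup that
// returns the names of the groups that list the given group as a child.
type parentIndex func(child string) ([]string, error)

// errLookup and failingLookup exist only to exercise the error path.
var errLookup = errors.New("index unavailable")

func failingLookup(string) ([]string, error) { return nil, errLookup }

// parentsToSync returns the parent group names to enqueue for a child group.
// On a lookup error it logs and returns early instead of falling through to
// process a result that may be nil or incomplete.
func parentsToSync(lookup parentIndex, child string) []string {
	parents, err := lookup(child)
	if err != nil {
		fmt.Printf("Error retrieving parents of ClusterGroup %s: %v\n", child, err)
		return nil
	}
	return parents
}

func main() {
	working := func(string) ([]string, error) { return []string{"cg-parent"}, nil }
	fmt.Println(parentsToSync(working, "cg-a"))
	fmt.Println(len(parentsToSync(failingLookup, "cg-a")))
}
```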
done
Signed-off-by: abhiraut <rauta@vmware.com>
LGTM
/test-all
This is the second of two PRs that relax admission validations for ClusterGroup references. This PR removes the restriction that a ClusterGroup must exist before it can be referred to in Antrea ClusterNetworkPolicies. It also allows a ClusterGroup to be deleted even if it is still referred to by an ACNP.
This PR also refactors the way test resources are created in testSteps of antreapolicy_test.go. Since resource creation/deletion dependencies are no longer enforced, e2e test cases can simply use an ordered list of test resource specifications to test different scenarios (e.g. an ACNP created before the CG it refers to, and vice versa).
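To illustrate the ordered-list idea only, here is a small sketch. The `testResource` type and `creationOrder` helper are hypothetical and are not the types used in antreapolicy_test.go. The point is that the creation order becomes plain data, so both orderings can be expressed without dependency checks.

```go
package main

import "fmt"

// testResource is a hypothetical spec for one object a test step creates.
type testResource struct {
	kind string // for example "ACNP" or "ClusterGroup"
	name string
}

// creationOrder lists "kind/name" labels in the order a step would create
// the resources, which is simply the order of the slice.
func creationOrder(resources []testResource) []string {
	order := make([]string, 0, len(resources))
	for _, r := range resources {
		order = append(order, r.kind+"/"+r.name)
	}
	return order
}

func main() {
	// Scenario 1: the ACNP is created before the ClusterGroup it refers to.
	acnpFirst := []testResource{
		{kind: "ACNP", name: "acnp-ref-cg"},
		{kind: "ClusterGroup", name: "cg-a"},
	}
	// Scenario 2: the same resources, with the ClusterGroup created first.
	cgFirst := []testResource{acnpFirst[1], acnpFirst[0]}

	fmt.Println(creationOrder(acnpFirst))
	fmt.Println(creationOrder(cgFirst))
}
```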