New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OSSM-2190 OSSM-2232 OSSM-2189 Don't reconcile invalid objects #1055
Conversation
|
||
if member.Name != common.MemberName { | ||
return reconcile.Result{}, r.reportError(ctx, member, maistrav1.ConditionReasonMemberNameInvalid, | ||
fmt.Errorf("the ServiceMeshMember name is invalid; must be 'default'")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Use a constant for this and tests usages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely! Thanks!
if roll.Name != common.MemberRollName { | ||
log.Info("Skipping reconciliation of SMMR with invalid name", "project", meshNamespace, "name", roll.Name) | ||
setReadyCondition(roll, false, maistrav1.ConditionReasonInvalidName, | ||
"the ServiceMeshMemberRoll name is invalid; must be 'default'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: constant as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
for _, smcp := range smcpList.Items { | ||
if earliest == nil || | ||
smcp.CreationTimestamp.Before(&earliest.CreationTimestamp) || | ||
(smcp.CreationTimestamp == earliest.CreationTimestamp && smcp.Name < earliest.Name) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case here? Is this even possible? (to have two equal timestamps for the same resource)
Why the name is important here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the name is important here?
I think that when timestamps are equal, you can't determine which one should be reconciled, so name is an additional parameter for selecting the right object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but name should not matter. alphabetical order doesn't mean anything. Or does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jewertow is right. The resolution of creationTimestamp
is 1 second, so you can easily have two SMCPs with the same timestamp. To make sure that the code always selects the same SMCP as the "earliest", I "sort" them by their names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, why not use random() instead of going alphabetical. What if 1st name is b
and second name is a
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, random() is a joke... I'm just trying to understand why we have this comparison at all. If timestamps are equal, pick any of those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't rely on resourceVersion for this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which SMCP gets reconciled must be deterministic. If we were to choose which SMCP to reconcile at random, the controller could reconcile both SMCPs or neither.
Think about what happens when we have two SMCPs with the same timestamp.
- The controller is asked to reconcile SMCP1. It chooses the earliest SMCP at random. Let's say it chooses SMCP2 this time, so it skips reconciliation of SMCP1 and adds the error to its status.
- The controller is then asked to reconcile SMCP2. This time, it chooses SMCP1 as the earliest, so it skips reconciliation of SMCP2 and adds the error to it.
Neither SMCP gets reconciled. In another scenario, the controller could reconcile both SMCPs.
That's why we need to sort by name (or some other field), so that the same SMCP is always chosen as the one to reconcile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resourceVersion
is supposed to be opaque, meaning that we shouldn't rely on its value. All we're allowed to do is check if two resourceVersions are equal.
It's possible that the format of resourceVersion
will change in the future. It can already be different if you use a datastore other than etcd.
smcp.CreationTimestamp.Before(&earliest.CreationTimestamp) || | ||
(smcp.CreationTimestamp == earliest.CreationTimestamp && smcp.Name < earliest.Name) { | ||
goPitfallWorkaround := smcp | ||
earliest = &goPitfallWorkaround |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? :o
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Go sucks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this change manually on OCP and for invalid SMM and SMMR everything works fine, but I'm not sure if for SMCP as well. I applied a valid SMCP and then another one in the same namespace, and it was not reconciled as expected. But then I removed the invalid SMCP and both were removed, so I lost my valid deployments.
When you remove the second SMCP, the first one does not get removed, but all its child resources are indeed deleted by the pruner. That's because the label selector that the pruner uses to delete resources only considers the SMCP namespace; it ignores the SMCP name. As this is a different bug, I've opened a new issue for it. Thanks @jewertow! |
Ok, thanks for the confirmation. |
These three commits ensure that the three controllers (SMCP, SMMR, and SMM) don't reconcile objects that shouldn't even exist.
The validation webhook prevents the creation of these objects, but you can create them if you first delete the webhook. If this happens, the controller must ignore them.