Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSSM-2190 OSSM-2232 OSSM-2189 Don't reconcile invalid objects #1055

Merged
merged 3 commits into from Nov 15, 2022

Conversation

luksa
Copy link
Contributor

@luksa luksa commented Nov 4, 2022

These three commits ensure that the three controllers (SMCP, SMMR, and SMM) don't reconcile objects that shouldn't even exist.

The validation webhook prevents the creation of these objects, but you can create them if you first delete the webhook. If this happens, the controller must ignore them.


if member.Name != common.MemberName {
return reconcile.Result{}, r.reportError(ctx, member, maistrav1.ConditionReasonMemberNameInvalid,
fmt.Errorf("the ServiceMeshMember name is invalid; must be 'default'"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Use a constant for this and tests usages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely! Thanks!

if roll.Name != common.MemberRollName {
log.Info("Skipping reconciliation of SMMR with invalid name", "project", meshNamespace, "name", roll.Name)
setReadyCondition(roll, false, maistrav1.ConditionReasonInvalidName,
"the ServiceMeshMemberRoll name is invalid; must be 'default'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: constant as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks.

for _, smcp := range smcpList.Items {
if earliest == nil ||
smcp.CreationTimestamp.Before(&earliest.CreationTimestamp) ||
(smcp.CreationTimestamp == earliest.CreationTimestamp && smcp.Name < earliest.Name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case here? Is this even possible? (to have two equal timestamps for the same resource)
Why the name is important here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the name is important here?

I think that when timestamps are equal, you can't determine which one should be reconciled, so name is an additional parameter for selecting the right object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but name should not matter. alphabetical order doesn't mean anything. Or does it?

Copy link
Contributor Author

@luksa luksa Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jewertow is right. The resolution of creationTimestamp is 1 second, so you can easily have two SMCPs with the same timestamp. To make sure that the code always selects the same SMCP as the "earliest", I "sort" them by their names.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, why not use random() instead of going alphabetical. What if 1st name is b and second name is a?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, random() is a joke... I'm just trying to understand why we have this comparison at all. If timestamps are equal, pick any of those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't rely on resourceVersion for this, right?

Copy link
Contributor Author

@luksa luksa Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which SMCP gets reconciled must be deterministic. If we were to choose which SMCP to reconcile at random, the controller could reconcile both SMCPs or neither.

Think about what happens when we have two SMCPs with the same timestamp.

  1. The controller is asked to reconcile SMCP1. It chooses the earliest SMCP at random. Let's say it chooses SMCP2 this time, so it skips reconciliation of SMCP1 and adds the error to its status.
  2. The controller is then asked to reconcile SMCP2. This time, it chooses SMCP1 as the earliest, so it skips reconciliation of SMCP2 and adds the error to it.

Neither SMCP gets reconciled. In another scenario, the controller could reconcile both SMCPs.

That's why we need to sort by name (or some other field), so that the same SMCP is always chosen as the one to reconcile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resourceVersion is supposed to be opaque, meaning that we shouldn't rely on its value. All we're allowed to do is check if two resourceVersions are equal.

It's possible that the format of resourceVersion will change in the future. It can already be different if you use a datastore other than etcd.

smcp.CreationTimestamp.Before(&earliest.CreationTimestamp) ||
(smcp.CreationTimestamp == earliest.CreationTimestamp && smcp.Name < earliest.Name) {
goPitfallWorkaround := smcp
earliest = &goPitfallWorkaround
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? :o

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Go sucks.

Copy link
Member

@jewertow jewertow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this change manually on OCP and for invalid SMM and SMMR everything works fine, but I'm not sure if for SMCP as well. I applied a valid SMCP and then another one in the same namespace, and it was not reconciled as expected. But then I removed the invalid SMCP and both were removed, so I lost my valid deployments.

@luksa
Copy link
Contributor Author

luksa commented Nov 15, 2022

But then I removed the invalid SMCP and both were removed, so I lost my valid deployments.

When you remove the second SMCP, the first one does not get removed, but all its child resources are indeed deleted by the pruner. That's because the label selector that the pruner uses to delete resources only considers the SMCP namespace; it ignores the SMCP name. As this is a different bug, I've opened a new issue for it.

Thanks @jewertow!

@jewertow
Copy link
Member

Ok, thanks for the confirmation.

@maistra-bot maistra-bot merged commit cd3ab52 into maistra:maistra-2.3 Nov 15, 2022
@luksa luksa deleted the OSSM-2190 branch November 13, 2023 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants