Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix racy FrameworkVersion and Instance reconciliation #363

Merged
merged 6 commits into from
Jun 21, 2019

Conversation

zen-dog
Copy link
Contributor

@zen-dog zen-dog commented Jun 20, 2019

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/component generator
/component kudoctl

/component operator

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind feature
/kind enhancement
/kind infrastructure
/kind kep

What this PR does / why we need it:
Currently, there is a race condition when applying FrameworkVersion and Instance CRDs. The second one relies on the first one to exist for the reconciliation. However, there is no "happened-before" guarantee. To fix this, instance_controller now watches the FrameworkVersion too and reconciles accordingly.

Which issue(s) this PR fixes:

Fixes #274

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@zen-dog zen-dog requested a review from runyontr June 20, 2019 10:01
@zen-dog zen-dog changed the title Ad/fix racy fv instance Fix racy FrameworkVersion and Instance reconciliation Jun 20, 2019
First draft:
 - `intance_controller` now watches for `FrameworkVersion` events and queues all corresponding instances for reconciliation
defer c.Delete(context.TODO(), frameworkVersion)

log.Printf("Then its instances are reconciled")
g.Eventually(requests, timeout).Should(gomega.Receive(gomega.Equal(expectedRequest)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also make sure that a deploy PlanExecution gets created. Since the Deploy plan only gets triggered when the Instance objects gets updated, it might not be enough to just reconcile the object.

https://github.com/kudobuilder/kudo/blob/master/pkg/controller/instance/instance_controller.go#L69

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good catch. Because the existing instance hasn't changed, reconciliation won't trigger an execution plan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it quite strange for a predicate to have side-effects (i.e., to create a PlanExecution). Is this a common/right pattern?

Looking at the Kubebuilder examples, I got the impression that predicates should just inspect events and decide whether the reconciler should take a look at them. But that the reconciler is responsible for making any changes.

It also seems to be a bad practice to look at the status in order to decide whether to apply any changes:

Remember, status should be able to be reconstituted from the state of the world, so it's generally not a good idea to read from the status of the root object. Instead, you should reconstruct it every run. That's what we'll do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find it quite strange for a predicate to have side-effects (i.e., to create a PlanExecution).

I agree, however, current logic compares old instance spec with the new one (e.g. for new/updated parameters) and one has access to the old and new object in the predicate but not in the reconciliation function.

@jbarrick-mesosphere
Copy link
Member

jbarrick-mesosphere commented Jun 20, 2019

I wrote in a comment on #361 (comment), but thought I'd include it here too:

This integration test can be written using the declarative test harness. Create a test case with three steps:

  1. Create Instance object.
  2. Create Framework and FrameworkVersion object and assert that the Instance's objects are created successfully.

Documentation is here before it's merged: https://github.com/kudobuilder/kudo/blob/e6afeb0e16e257cc9a2a0b6ca542b55732a10c40/docs/testing.md

pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller.go Outdated Show resolved Hide resolved
pkg/controller/instance/instance_controller_test.go Outdated Show resolved Hide resolved
defer c.Delete(context.TODO(), frameworkVersion)

log.Printf("Then its instances are reconciled")
g.Eventually(requests, timeout).Should(gomega.Receive(gomega.Equal(expectedRequest)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it quite strange for a predicate to have side-effects (i.e., to create a PlanExecution). Is this a common/right pattern?

Looking at the Kubebuilder examples, I got the impression that predicates should just inspect events and decide whether the reconciler should take a look at them. But that the reconciler is responsible for making any changes.

It also seems to be a bad practice to look at the status in order to decide whether to apply any changes:

Remember, status should be able to be reconstituted from the state of the world, so it's generally not a good idea to read from the status of the root object. Instead, you should reconstruct it every run. That's what we'll do here.

Aleksey Dukhovniy and others added 2 commits June 21, 2019 11:59
Review suggestions

Co-Authored-By: Gastón Kleiman <gaston@apache.org>
when a `FrameworkVersion` is **created** and there are instances without an active execution plan.
return requests
})

if err = c.Watch(&source.Kind{Type: &kudov1alpha1.FrameworkVersion{}}, &handler.EnqueueRequestsFromMapFunc{ToRequests: mapFn}); err != nil {
// This map function makes sure that we *ONLY* handle created frameworkVersion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brings up a bigger question that's out of scope for here:

IF someone updates a FV, what do we do with the instances?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of implications here. The instance might drastically change on the next plan run, even if we didn't do anything. Maybe we need to serialize the current FV to the instance.Status or something, idk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the same race is certainly possible with FV+I update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FrameworkVersions and Instances can race
5 participants