
Compare generations when available #325

Merged: KnVerey merged 1 commit into master from observed_generation on Aug 8, 2018

Conversation

@KnVerey (Contributor) commented Aug 7, 2018

Problem

I just observed an instant false-positive success on a web deployment. Relevant lines:

[INFO][2018-08-07 19:27:25 +0000]	Successfully deployed in 1.9s: [...] Deployment/web
[INFO][2018-08-07 19:27:25 +0000]	Deployment/web                                    3 replicas, 3 updatedReplicas, 3 availableReplicas

The rollout definitely did not complete in less than 2s.

Checking the code, I noticed that we aren't comparing metadata.generation to status.observedGeneration before using the status to determine success. This means we might be making that determination from stale data, which would explain the behaviour above.

Solution

Compare metadata.generation to status.observedGeneration before making the deploy_succeeded? and deploy_failed? determinations whenever possible. I used the API documentation to figure out which controllers actually populate the observed generation.

Since this is a controller race, I can't think of a way to reproduce it reliably in integration tests. I have added regression tests at the unit level instead.
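
For illustration, here is a minimal sketch of the guard pattern (not the exact implementation). It assumes the resource class keeps the object fetched from the API server in @instance_data as a plain Hash; the -1 sentinel for current_generation is an assumption, chosen so the two methods never return equal values before the resource exists:

```ruby
# Sketch only. @instance_data is assumed to be the Hash returned by the API
# server for this resource.
def current_generation
  return -1 unless exists?
  @instance_data.dig("metadata", "generation")
end

def observed_generation
  return -2 unless exists?
  # populating this is a best practice, but not all controllers actually do it
  @instance_data.dig("status", "observedGeneration")
end

def deploy_failed?
  pods.present? && pods.any?(&:deploy_failed?) &&
    observed_generation == current_generation
end
```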

@Shopify/cloudx

@@ -24,7 +24,8 @@ def deploy_succeeded?
     end

     def deploy_failed?
-      pods.present? && pods.any?(&:deploy_failed?)
+      pods.present? && pods.any?(&:deploy_failed?) &&
+        observed_generation == current_generation
Contributor:
should this not be !=? Or maybe I just misunderstood it

Contributor:
As I understand it (and @KnVerey will hopefully correct me if I'm wrong), current_generation gets bumped when we apply the config, and observed_generation gets set by the controller reporting the status. When they don't match, it means we're looking at a status for the previous config, not the current one. E.g. we don't want to say the deployment has failed or succeeded before the status reflects the current config's state.

Contributor:
oh! Thanks for the clarification


def observed_generation
  return -2 unless exists?
  # populating this is a best practice, but not all controllers actually do it
Contributor:
Out of curiosity, do you have an example of a controller that doesn't populate it?

Contributor (Author):
Since all objects have a metadata generation, theoretically any controller could be setting observedGeneration, though this is only meaningful if the controller writes a status at all. Examples of resources that have a status but no observedGeneration are Job and CronJob. See also this k8s core issue. As a side note, in 1.11 (I think it went beta there?), CRs will have a reliable metadata.generation that we can use to do this properly in our own controllers. Our generic CR success feature should make use of it when available.
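
As a purely hypothetical sketch of how a generic check could degrade for resources like Job or CronJob that never report observedGeneration (the helper name generation_current? is made up for illustration):

```ruby
# Hypothetical helper, not code from this PR: only trust the status when the
# controller reports which generation it corresponds to; if the field is
# absent (e.g. Job or CronJob), there is nothing to compare, so fall back to
# using the status as-is.
def generation_current?
  observed = @instance_data.dig("status", "observedGeneration")
  return true if observed.nil?
  observed == @instance_data.dig("metadata", "generation")
end
```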

@karanthukral (Contributor) left a comment

Just the one question for clarification, but the rest looks good to me.

@dturn (Contributor) left a comment

Out of curiosity: is there a reasonable way to return early from sync when observed_generation != current_generation? It doesn't seem like we get anything useful when that happens.

@KnVerey (Author) commented Aug 8, 2018

> Is there a reasonable way to return early from sync when observed_generation != current_generation? Doesn't seem like we get anything useful when that happens.

What do you have in mind that we could skip? The basic sync is just @instance_data = mediator.get_instance(type, name), and we don't know whether the generations match until that request returns. Maybe this does suggest an additional state for our tentative state machine implementation, though (names not to be taken seriously): undeployed -> (outdated) -> pending -> succeeded/failed/timed out
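
Purely to illustrate the idea (the state names come straight from the line above and are explicitly not to be taken seriously), the extra state might look like:

```ruby
# Illustrative placeholder only -- not an implementation in this repo.
module DeployState
  UNDEPLOYED = :undeployed # resource hasn't been created yet
  OUTDATED   = :outdated   # status.observedGeneration lags metadata.generation
  PENDING    = :pending    # controller has seen the latest spec; rollout in progress
  SUCCEEDED  = :succeeded
  FAILED     = :failed
  TIMED_OUT  = :timed_out
end

# A resource would only be allowed to move to SUCCEEDED or FAILED from PENDING,
# i.e. after the observed generation has caught up with the current generation.
```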

@dturn (Contributor) commented Aug 8, 2018

Having to put this check in every class makes me think we're missing an abstraction layer. The sync method was an attempt, but I think you're right that the state machine is probably the best way forward.

@KnVerey (Author) commented Aug 8, 2018

Yeah, it definitely irked me to write that line in so many places. I could technically add that pseudo-state to our current model like this, but we might get unexpected results unless we also make every place that partitions resources into groups aware of the possibility:

[screenshot of the proposed change]

WDYT?

@dturn (Contributor) commented Aug 8, 2018

I think it's better as-is than adding a new state to unwind later.

@KnVerey (Author) commented Aug 8, 2018

👍 I'll merge as-is then. I added a comment to your WIP so we don't forget about this extra state.

@KnVerey merged commit 10a0c37 into master on Aug 8, 2018
@KnVerey deleted the observed_generation branch on Aug 8, 2018 at 18:10