Waiting For a Plan to Finish #1461

kensipe · 2020-04-10T19:09:04Z

@anthonydahanne, a community member, made the first PR to introduce --wait to install.

As outlined in issue #1418, there is a desire to move this code into a reusable space (not the CLI), with a request to add it to kudoClient. This makes it possible to use kudo as a library, in particular from Terraform for https://github.com/kudobuilder/terraform-provider-kudo. WaitForInstance in kudo.go now provides this ability.

After modifying kudo install to use this wait, it made sense to add a wait-timeout for client control. This is super important as it is very unclear how long a plan will take to "finish".

The challenge from a user perspective at that point is what if the timeout expired and you want to wait again... or what if you didn't wait but now you want to. It just made sense to add kudo plan wait with a wait-time as well. The kudo plan wait works for any plan... it will wait for whatever the active plan is to finish.

To use this new feature:

# precondition
go run cmd/kubectl-kudo/main.go install mysql

# then 
go run cmd/kubectl-kudo/main.go plan wait --instance mysql-instance

New feature in plan submenu:

 go run cmd/kubectl-kudo/main.go plan --help
The plan command has subcommands to view all available plans.

Usage:
  kubectl-kudo plan [command]

Available Commands:
  history     Lists history to a specific operator-version of an instance.
  status      Shows the status of all plans to an particular instance.
  trigger     Triggers a specific plan on a particular instance.
  wait        Waits on a plan to finish for a particular instance.

help for plan wait

 go run cmd/kubectl-kudo/main.go plan wait --help
Waits on a plan to finish for a particular instance.

Usage:
  kubectl-kudo plan wait [flags]

Examples:
  # Wait on the current plan status to finish
  kubectl kudo plan wait --instance=<instanceName>

Fixes #1418

Tagging @anthonydahanne in case he wanted to see this work


kensipe · 2020-04-11T20:35:04Z

I received some feedback questioning the newly added cli command kudo plan wait. If you modify an instance and immediately "wait" on that change... it is possible that the manager didn't see the change yet and the wait will complete before the plan is invoked... It was deemed by this feedback as not worth adding in. A couple of thoughts around this for a debate:

the plan status has the same issue: if you run plan status, it could show the "status" from before the plan started, so repeated calls to plan status are necessary
I still see value in the case where an admin installs an operator and a dev (or anyone else who wants to use it) wants to know when the deploy is done... the installer doesn't want to wait but another user does.
IMO it is worth having... and worth documenting this situation for awareness... what do other reviewers, teammates and community think?

alenkacz · 2020-04-13T09:39:51Z

@kensipe This is solvable with the admission webhook as that already marks Instance into a state where it's apparent that a plan will be run and it does not have to wait for controller to see it. Potentially without webhook, using generation and/or version of instance could be a solution but the wait then would have to be something like kudo plan trigger --wait for that to work (you need version/generation before and after execution)


kensipe · 2020-04-13T14:02:29Z

we currently have a solution when a plan is triggered / update or upgrade... the issue mentioned is the inability to have this information if a plan is triggered and another user (or the same user later in time) wants to wait... in which case we don't have knowledge of the previous instance at that time.

kensipe · 2020-04-13T14:07:25Z

just updated with a new wait on plan status... super cool IMO.

The plan status keeps refreshing in place on the terminal until the plan is complete (if --wait is used). I added an elapsed time because it can look like the screen is locked when it is not. The last update on the status is the last change to the plan by kudo manager.

The plan status without wait works the same as before. With --wait, it will loop until: 1) the user exits the process with Ctrl+C or 2) the plan completes.

go run cmd/kubectl-kudo/main.go plan status --instance mysql-instance --wait
Plan(s) for "mysql-instance" in namespace "default":
.
└── mysql-instance (Operator-Version: "mysql-0.2.0" Active-Plan: "deploy")
    ├── Plan backup (serial strategy) [NOT ACTIVE]
    │   └── Phase backup (serial strategy) [NOT ACTIVE]
    │       ├── Step pv [NOT ACTIVE]
    │       ├── Step backup [NOT ACTIVE]
    │       └── Step cleanup [NOT ACTIVE]
    ├── Plan deploy (serial strategy) [COMPLETE], last updated 2020-04-13 08:56:40
    │   └── Phase deploy (serial strategy) [COMPLETE]
    │       ├── Step deploy [COMPLETE]
    │       ├── Step init [COMPLETE]
    │       └── Step cleanup [COMPLETE]
    └── Plan restore (serial strategy) [NOT ACTIVE]
        └── Phase restore (serial strategy) [NOT ACTIVE]
            ├── Step restore [NOT ACTIVE]
            └── Step cleanup [NOT ACTIVE]

elapsed time 6.086202815s✔

This is harder to see in a static image... so I created a video to show it off. (this is the feature I always wanted!)

https://drive.google.com/file/d/1xp-Eax1XtmAKp9eEtpaUnqDcFU8U5-Dj/view?usp=sharing

gerred · 2020-04-13T21:17:34Z

pkg/kudoctl/cmd/plan/plan_status.go

-		return fmt.Errorf("OperatorVersion %s from instance %s/%s does not exist", instance.Spec.OperatorVersion.Name, ns, options.Instance)
-	}
+	// for loop breaks if Wait==false, or when active plan completes (or when user exits process)
+	for {


A lot of this logic is really complex and there are some great Go constructs for doing this nicely. I'd prefer we do this with channels and select, which were designed for this and can be a lot clearer. Check this out:

https://www.sohamkamani.com/golang/2018-06-17-golang-using-context-cancellation/

Put the logic into a goroutine that takes a channel, and then select over that and a time.After(options.WaitTime * time.Second). The goroutine can use a for loop with time.Sleep (and then breaks), but there's also time.Ticker depending on what you want to do.

zen-dog · 2020-04-14T12:33:05Z

> The challenge from a user perspective at that point is what if the timeout expired and you want to wait again

I'm not sure how important this use case is. The initial issue #966 is the 80/20 use case: wait for the instance update (or triggered plan) to finish. Doing that with two commands is inconvenient but more importantly racy (as @alenkacz mentioned above). While "the plan status has the same issue in that if you do a plan status, it could be the "status" prior to the plan" statement is correct in itself, a plan status command is a snapshot in time: it can happen before, during or after and has no expectations to reflect a previously started plan. This is clearly not the case when --waiting for an Instance update.

Tech debt introduced by the original #966 PR is still present and I suspect that users will run into raciness issues when using this command after updating instances and triggering plans. Additionally, it is debatable whether the plan wait use case is worth the added complexity.

kensipe · 2020-04-15T16:17:49Z

Now that we have plan status --wait I'm not sure I see the value in plan wait and since it is questionable perhaps we should remove it.

There is a "raciness" from a cli perspective regarding plan status --wait and plan wait which is understandable and should be documented but doesn't diminish the value of the feature IMO. From the CLI perspective, there isn't a way to be aware that a change has been requested but it hasn't been identified by the controller yet.

There is NO tech debt on #966 however... the imagined race condition does not exist, with one small exception... if there is an uninstall and a rapid re-install with the same instance name... it is possible that the stale state of the previous instance isn't cleaned up yet.

It is also important to note... that unlike a manager race condition... there is no ill side-effect of a racy condition. There is the potential of a confused user and poor UX. For users that know what they are doing it is still a worthy feature.

kensipe added 3 commits April 10, 2020 13:04

first version not working

7040faa

Signed-off-by: Ken Sipe <kensipe@gmail.com> working version Signed-off-by: Ken Sipe <kensipe@gmail.com>

added wait-time as a CLI flag for isntall with pass thru to install p…

7654895

…ackage Signed-off-by: Ken Sipe <kensipe@gmail.com>

adding plan wait and better error handling

7500184

Signed-off-by: Ken Sipe <kensipe@gmail.com>

kensipe requested review from alenkacz, gerred, nfnt and zen-dog as code owners April 10, 2020 19:09

kensipe mentioned this pull request Apr 10, 2020

Refactor install / update wait kudobuilder/terraform-provider-kudo#14

Open

kensipe added the release/highlight This PR is a highlight for the next release label Apr 10, 2020

kensipe requested a review from runyontr April 10, 2020 19:12

kensipe added this to the 0.13.0 milestone Apr 10, 2020

waiting on plan status

32d9d66

Signed-off-by: Ken Sipe <kensipe@gmail.com>

gerred requested changes Apr 13, 2020


gerred approved these changes Apr 13, 2020


kensipe merged commit bf5c090 into master Apr 14, 2020

kensipe added this to Done in KUDO Global via automation Apr 14, 2020

kensipe deleted the ken/instance-wait branch April 14, 2020 02:01

zen-dog mentioned this pull request Apr 15, 2020

kudo update --wait is racy #1465

Closed
