AWS patch and minor workload cluster upgrade rollback #221

Closed
alex-dabija opened this issue Oct 26, 2020 · 3 comments
Labels
kind/story provider/aws Related to cloud provider Amazon AWS team/firecracker

Comments

alex-dabija commented Oct 26, 2020

User Story

As a customer, I want to have the option to roll back patch and minor workload cluster upgrades in order to mitigate the risk of having a cluster in an invalid state after an upgrade.

Background

The risk of leaving an upgraded cluster in an invalid state grows as we perform more and more upgrades. At the moment, rollbacks are disabled, so if an upgrade goes bad the only option is to fix the problem.

xh3b4sd commented Nov 10, 2020

I discussed this internally with Fer after I could not find anything blocking this in the deprecated API. We agreed this should rather be done on the Control Plane so that it also works in the future. However, as discussed in planning, this means we want to move everything over from OPA to our own admission controller first. So we may want to put this on the back burner until we have our own rules under control.
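
For illustration, the kind of version rule such an admission controller could enforce might look roughly like the sketch below. This is not the actual rule from OPA or from the admission controller. The function names `parse_version`, `classify_change`, and `is_allowed` are hypothetical, and rejecting major rollbacks is an assumption drawn only from the issue title, which covers patch and minor rollbacks.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn a release string such as 'v13.1.0' into (13, 1, 0)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def classify_change(current: str, target: str) -> str:
    """Describe a release change, e.g. 'minor rollback' or 'patch upgrade'."""
    cur, tgt = parse_version(current), parse_version(target)
    if cur == tgt:
        return "none"
    direction = "upgrade" if tgt > cur else "rollback"
    if cur[0] != tgt[0]:
        level = "major"
    elif cur[1] != tgt[1]:
        level = "minor"
    else:
        level = "patch"
    return f"{level} {direction}"


def is_allowed(current: str, target: str) -> bool:
    """Hypothetical rule (an assumption): reject only major rollbacks."""
    return classify_change(current, target) != "major rollback"


print(classify_change("v13.1.0", "v13.0.0"))
print(is_allowed("v13.1.0", "v13.0.0"))
```

Under these assumptions, a change from v13.1.0 to v13.0.0 is classified as a minor rollback and would be admitted.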

marians changed the title from "AWS patch and minor tenant cluster upgrade rollback" to "AWS patch and minor workload cluster upgrade rollback" on Jan 14, 2021

pipo02mix (Contributor) commented

We discussed this in WG Successful Upgrades. It was blocked by the OPA migration, and now, according to Alex, it is ready to go.

xh3b4sd commented Feb 10, 2021

We can downgrade clusters now. Below is a test run from our conformance tests, downgrading a cluster from v13.1.0 to v13.0.0.

$ export AWSCNFM_CREATE_RELEASEVERSION=v13.1.0
$ export AWSCNFM_UPDATE_RELEASEVERSION=v13.0.0
$ awscnfm plan pl003
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `create cluster defaultcontrolplane`","time":"2021-02-10T09:26:33.698461+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/cmd/action/create/cluster/defaultcontrolplane/runner.go:75","level":"info","message":"creating crs for tenant cluster 4c538","time":"2021-02-10T09:26:34.19388+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `create nodepool defaultdataplane`","time":"2021-02-10T09:26:34.520248+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `verify cluster created`","time":"2021-02-10T09:26:34.916969+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `verify master ready`","time":"2021-02-10T09:44:15.798355+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `verify worker ready`","time":"2021-02-10T09:44:16.389093+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `update cluster minor`","time":"2021-02-10T09:44:16.920021+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `verify cluster updated`","time":"2021-02-10T09:44:17.384363+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `delete cluster`","time":"2021-02-10T10:04:18.559359+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:68","level":"info","message":"executing action `verify cluster deleted`","time":"2021-02-10T10:04:18.845921+00:00"}
{"caller":"github.com/giantswarm/awscnfm/v12/pkg/plan/executor.go:105","level":"error","message":"failed executing action `verify cluster deleted`","time":"2021-02-10T11:34:23.324357+00:00"}

Cluster deletion failed in the test due to an app-operator bug. I talked with Team Batman, and they have already fixed it.
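
The timestamps in the log above show how long each plan step ran, which makes the failed deletion check easy to spot. Below is a minimal sketch of that calculation. It assumes each executor entry marks the start of an action and the next entry marks its end. The helper `action_durations` is hypothetical and not part of awscnfm, and the log lines are trimmed to the `level`, `message`, and `time` fields of the executor entries.

```python
import json
import re
from datetime import datetime

# Executor entries copied from the test run above, trimmed to three fields.
LOG = """\
{"level":"info","message":"executing action `create cluster defaultcontrolplane`","time":"2021-02-10T09:26:33.698461+00:00"}
{"level":"info","message":"executing action `create nodepool defaultdataplane`","time":"2021-02-10T09:26:34.520248+00:00"}
{"level":"info","message":"executing action `verify cluster created`","time":"2021-02-10T09:26:34.916969+00:00"}
{"level":"info","message":"executing action `verify master ready`","time":"2021-02-10T09:44:15.798355+00:00"}
{"level":"info","message":"executing action `verify worker ready`","time":"2021-02-10T09:44:16.389093+00:00"}
{"level":"info","message":"executing action `update cluster minor`","time":"2021-02-10T09:44:16.920021+00:00"}
{"level":"info","message":"executing action `verify cluster updated`","time":"2021-02-10T09:44:17.384363+00:00"}
{"level":"info","message":"executing action `delete cluster`","time":"2021-02-10T10:04:18.559359+00:00"}
{"level":"info","message":"executing action `verify cluster deleted`","time":"2021-02-10T10:04:18.845921+00:00"}
{"level":"error","message":"failed executing action `verify cluster deleted`","time":"2021-02-10T11:34:23.324357+00:00"}
"""

# The action name is the text between backticks in each message.
ACTION = re.compile(r"`([^`]+)`")


def action_durations(log_text):
    """Return {action: seconds}, measured from one log entry to the next."""
    entries = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        match = ACTION.search(record["message"])
        if match is None:
            continue  # skip entries that do not name an action
        entries.append((match.group(1), datetime.fromisoformat(record["time"])))

    durations = {}
    for (name, start), (_, end) in zip(entries, entries[1:]):
        durations[name] = (end - start).total_seconds()
    return durations


for name, seconds in action_durations(LOG).items():
    print(f"{name}: {seconds / 60:.1f} min")
```

On this log, the update verification took about 20 minutes, while `verify cluster deleted` ran for about 90 minutes before the executor reported the failure.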

xh3b4sd closed this as completed on Feb 10, 2021