MM-14999: Installation API #15
Conversation
This changelist introduces the CLI API, REST API, store updates and supervisor additions to manage installations and cluster installations. Created installations attempt to schedule a cluster installation on the first available cluster. While the API remains async as previously changed, I've ripped out the in-process parallelism in an attempt to simplify the data model surrounding locking. This will need to be revisited when we scale the provisioning server horizontally.
That's a lot of code but I think I internalized what's happening pretty well. Looks good for the most part, just a few comments
	return
}

logger.Debugf("Transitioned installation from %s to %s", oldState, newState)
I wonder if this would be good to log at INFO level rather than DEBUG or would that be too spammy?
I personally found this a little spammy, and tried to make sure the INFO-level logs capture all the detail needed to understand the meaningful changes here. I propose we run with debug enabled in production anyway, but filter it out downstream.
That's a lot of code! :) Nice functionality boost here.
So, you are right that some in-flight PRs in operator will be very handy for long-term integration functionality with CRs, but we have a good way to stub it out right now (which is exactly what you did).
Left a few comments, but it is looking good!
var installationCreateCmd = &cobra.Command{
	Use:   "create",
	Short: "Create an installation.",
I am not advocating for a change at this time, but I am worried about our double (triple?) usage of the word installation. Maybe I am just missing something that would help me build up the innate meaning of cluster installation vs. installation, but I am finding it to be a bit of a struggle.
We may want to see if some other verbiage works better in the future.
I agree that installation as a concept is likely overkill given our current plans to make them 1:1 with a cluster installation. My hope was that by forcing the concept of a multi-cluster installation on the domain, we'd avoid painting ourselves into a naming corner that would be even more confusing in the future. Very happy to iterate on this, though!
var installationUpgradeCmd = &cobra.Command{
	Use:   "upgrade",
	Short: "Upgrade (or downgrade) the version of Mattermost.",
Do we actually want to support downgrades? I don't have enough experience to know the pros and cons of this decision.
5/5 that we should support downgrades: otherwise there is no way to roll back an installation gone awry. The current plan is to let the operator decide how far back to allow such a downgrade -- e.g. we may not support downgrading across major versions.
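A guard along those lines could be sketched as follows; allowVersionChange and the bare-bones major parsing are hypothetical names for illustration (a real implementation would likely lean on a proper semver library):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// major extracts the leading major component of a version like "5.10.0"
// or "v5.10.0", returning -1 if it cannot be parsed.
func major(version string) int {
	m, err := strconv.Atoi(strings.SplitN(strings.TrimPrefix(version, "v"), ".", 2)[0])
	if err != nil {
		return -1
	}
	return m
}

// allowVersionChange permits any upgrade, but only allows downgrades
// within the same major version.
func allowVersionChange(current, target string) bool {
	cur, tgt := major(current), major(target)
	if cur < 0 || tgt < 0 {
		return false
	}
	return tgt >= cur
}

func main() {
	fmt.Println(allowVersionChange("5.12.0", "5.10.2")) // true: same-major downgrade
	fmt.Println(allowVersionChange("5.1.0", "4.9.4"))   // false: crosses a major boundary
	fmt.Println(allowVersionChange("5.1.0", "6.0.0"))   // true: upgrades always allowed
}
```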
	deletingClusterInstallations++
}

logger.Debugf(
Good place for WithFields() in my opinion.
2/5, but I'm not sure fields are fungible for sprintf-statements in all cases. Certainly we'd want structured logging for pretty much all common verbiage (e.g. the cluster and installation we do today), but are we actually looking to "parse" cluster_installation_deleted_count, etc.?
In my head the fields would be no different than what you had: deleting, failed, etc. If you just wanted these messages, you would narrow the search down with a second criterion, such as something else in the log message.
We can always implement this later if we feel like it!
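To make the trade-off concrete, here is a sketch of both shapes of the same log line. The project's logger is logrus, but the snippet below uses the standard library's log/slog so it stays dependency-free; the field names and counts are illustrative:

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// sweepLine renders a field-style record for a hypothetical cluster
// installation sweep, stripping the timestamp so the output is deterministic.
func sweepLine(deleting, failed int) string {
	var buf bytes.Buffer
	handler := slog.NewTextHandler(&buf, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if a.Key == slog.TimeKey {
				return slog.Attr{} // an empty key discards the attribute
			}
			return a
		},
	})
	slog.New(handler).Info("cluster installation sweep", "deleting", deleting, "failed", failed)
	return buf.String()
}

func main() {
	// sprintf-style: counts are baked into the message and must be parsed back out.
	fmt.Printf("cluster installations: %d deleting, %d failed\n", 2, 1)

	// field-style: counts are separate key/value pairs a log pipeline can
	// filter on without parsing the message text.
	fmt.Print(sweepLine(2, 1))
}
```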
Co-Authored-By: Gabe Jackson <gabe@coffeepowered.co>
Thanks for the careful reviews and feedback, @jwilander & @gabrieljackson! A few comments above, plus some improvements based on the requested feedback :)
LGTM 👍
This changelist introduces the CLI API, REST API, store updates and supervisor additions to manage installations and cluster installations. Created installations attempt to schedule a cluster installation on the first available cluster. No attempt is currently made to wait for the cluster installation to actually stabilize, but I anticipate integrating that with pending changes from @gabrieljackson.

While the API remains async as previously changed, I've ripped out the in-process parallelism in an attempt to simplify the data model surrounding locking. This will need to be revisited when we scale the provisioning server horizontally, but for now I haven't been able to come up with a straightforward strategy that avoids lock contention. Very open to suggestions!
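With a single supervisor process the locking can stay simple; the sticking point for horizontal scaling is that the claim itself must become atomic in the store. A toy sketch of the claim semantics (the type and names are hypothetical; a store-backed version would use a conditional UPDATE on the installation row rather than an in-memory map):

```go
package main

import (
	"fmt"
	"sync"
)

// lockStore mimics the row-level advisory lock a store-backed lock would
// provide: a claim succeeds only if no other owner currently holds the row.
type lockStore struct {
	mu     sync.Mutex
	owners map[string]string // installation ID -> lock owner
}

// TryLock atomically claims the installation for owner, failing if it is held.
func (s *lockStore) TryLock(installationID, owner string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, held := s.owners[installationID]; held {
		return false
	}
	s.owners[installationID] = owner
	return true
}

// Unlock releases the claim, but only for the owner that holds it.
func (s *lockStore) Unlock(installationID, owner string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.owners[installationID] == owner {
		delete(s.owners, installationID)
	}
}

func main() {
	store := &lockStore{owners: map[string]string{}}
	fmt.Println(store.TryLock("inst-1", "supervisor-a")) // true: first claim wins
	fmt.Println(store.TryLock("inst-1", "supervisor-b")) // false: already held
	store.Unlock("inst-1", "supervisor-a")
	fmt.Println(store.TryLock("inst-1", "supervisor-b")) // true: released, reclaimable
}
```

Contention then reduces to losers of the claim skipping the installation and retrying on the next supervisor pass, rather than blocking.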