
MM-14999: Installation API #15

Merged
merged 4 commits into master from mm-14999-installation-api May 16, 2019

Conversation

lieut-data (Member):

This changelist introduces the CLI API, REST API, store updates and
supervisor additions to manage installations and cluster installations.
Created installations attempt to schedule a cluster installation on the
first available cluster. No attempt is currently made to wait for the cluster installation to actually stabilize, but I anticipate integrating that with pending changes from @gabrieljackson.

While the API remains async as previously changed, I've ripped out the
in-process parallelism in an attempt to simplify the data model
surrounding locking. This will need to be revisited when we scale the
provisioning server horizontally, but for now I haven't been able to come up with a straightforward strategy that avoids lock contention. Very open to suggestions!

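The locking model described above (one supervisor working an installation at a time, compare-and-set through the store rather than in-process parallelism) can be sketched as follows. This is a hypothetical illustration only: the `fakeStore`, its method names, and the in-memory map stand in for the real SQL-backed store, where the same check would be a conditional `UPDATE ... WHERE lock_acquired_by IS NULL` with the affected-row count deciding success.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeStore stands in for the SQL store. A real implementation would issue
// a conditional UPDATE and report whether a row changed; here a mutex-guarded
// map plays that role. All names are hypothetical.
type fakeStore struct {
	mu    sync.Mutex
	locks map[string]string // installation id -> lock owner
}

// LockInstallation attempts to acquire the lock for owner, returning true on success.
func (s *fakeStore) LockInstallation(id, owner string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, held := s.locks[id]; held {
		return false // someone else holds the lock; skip this installation
	}
	s.locks[id] = owner
	return true
}

// UnlockInstallation releases the lock, but only if owner actually holds it.
func (s *fakeStore) UnlockInstallation(id, owner string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.locks[id] == owner {
		delete(s.locks, id)
	}
}

func main() {
	store := &fakeStore{locks: map[string]string{}}
	fmt.Println(store.LockInstallation("inst1", "supervisor-a")) // true
	fmt.Println(store.LockInstallation("inst1", "supervisor-b")) // false: already held
	store.UnlockInstallation("inst1", "supervisor-a")
	fmt.Println(store.LockInstallation("inst1", "supervisor-b")) // true
}
```

With a single provisioning server this never contends; the horizontal-scaling concern in the description is that multiple servers polling the same table would repeatedly fail this compare-and-set against each other.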
@lieut-data lieut-data added the 2: Dev Review Requires review by a developer label May 11, 2019
@jwilander (Member) left a comment:

That's a lot of code but I think I internalized what's happening pretty well. Looks good for the most part, just a few comments

internal/api/cluster_installation.go (outdated, resolved)
internal/api/cluster_installation.go (outdated, resolved)
internal/api/installation.go (outdated, resolved)
return
}

logger.Debugf("Transitioned installation from %s to %s", oldState, newState)
Member:

I wonder if this would be good to log at INFO level rather than DEBUG, or would that be too spammy?

lieut-data (Author):

I personally found this a little spammy, and tried to make sure that the INFO-level logs capture all the detail necessary to understand the meaningful changes here. I propose we run with debug enabled in production anyway, but filter it out downstream.

@gabrieljackson (Contributor) left a comment:

That's a lot of code! :) Nice functionality boost here.

So, you are right that some in-flight PRs in operator will be very handy for long-term integration functionality with CRs, but we have a good way to stub it out right now (which is exactly what you did).

Left a few comments, but it is looking good!


var installationCreateCmd = &cobra.Command{
	Use:   "create",
	Short: "Create an installation.",
gabrieljackson (Contributor):

I am not advocating for a change at this time, but I am worried about our double (triple?) usage of the word installation. Maybe I am just missing something that would help me internalize the meaning of cluster installation vs. installation, but I am finding it to be a bit of a struggle.

We may want to see if some other verbiage works better in the future.

lieut-data (Author):

I agree that installation as a concept is likely overkill given our current plans to make them 1:1 with a cluster installation. My hope was that by forcing the concept of a multi-cluster installation on the domain, we'd avoid painting ourselves into a naming corner that would be even more confusing in the future. Very happy to iterate on this, though!


var installationUpgradeCmd = &cobra.Command{
	Use:   "upgrade",
	Short: "Upgrade (or downgrade) the version of Mattermost.",
gabrieljackson (Contributor):

Do we actually want to support downgrades? I don't have enough experience to know the pros and cons of this decision.

lieut-data (Author):

5/5 that we should support downgrades: otherwise there is no way to roll back an installation gone awry. The current plan is to let the operator decide how far back to allow such a downgrade -- e.g. we may not support downgrading across major versions.

cmd/cloud/installation.go (outdated, resolved)
internal/provisioner/kops_provisioner.go (resolved)
deletingClusterInstallations++
}

logger.Debugf(
gabrieljackson (Contributor):

Good place for WithFields() in my opinion.

lieut-data (Author):

2/5, but I'm not sure fields are interchangeable with sprintf-style statements in all cases. Certainly we'd want structured logging for pretty much all common verbiage (e.g. the cluster and installation we log today), but are we actually looking to "parse" cluster_installation_deleted_count, etc.?

gabrieljackson (Contributor):

In my head the fields would be no different than what you had: deleting, failed, etc. You would narrow the search down with a second criterion (something else in the log message) if you just wanted these messages.

We can always implement this later if we feel like it!

lieut-data (Author):

Thanks for the careful reviews and feedback, @jwilander & @gabrieljackson! A few comments above, plus some improvements based on the requested feedback :)

@jwilander (Member) left a comment:

LGTM 👍

@lieut-data lieut-data added 4: Reviews Complete All reviewers have approved the pull request and removed 2: Dev Review Requires review by a developer labels May 16, 2019
@lieut-data lieut-data merged commit 4fb9099 into master May 16, 2019
@lieut-data lieut-data deleted the mm-14999-installation-api branch May 16, 2019 15:37