
Etcd v3 Support #3176

Merged — 2 commits merged into kubernetes:master on Aug 12, 2017
Conversation

@gambol99 (Contributor) commented Aug 9, 2017

Etcd V3 Support

The current implementation is running v2.2.1, which is two years old and end of life. This PR adds the ability to use etcd v3 and to set the versions if required. Note that at the moment the image still comes from the gcr.io registry and, much like the Etcd TLS PR, there is presently no 'automated' migration path from v2 to v3.

  • the feature is gated behind the version of the etcd cluster; both the events and main clusters must use the same storage type
  • the version for v2 is unchanged and pinned at v2.2.1, with v3 using v3.0.17
  • @question: we should consider allowing the user to override the images, though I think this should be addressed generically rather than as one-offs here and there. I know @chrislovecnm is working on an asset registry??

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 9, 2017
@k8s-ci-robot (Contributor)

Hi @gambol99. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 9, 2017
@gambol99 gambol99 changed the title Etcv3 Support Etcd v3 Support Aug 9, 2017
@gambol99 (Contributor, Author) commented Aug 9, 2017

/assign @justinsb

@@ -181,6 +181,7 @@ type ProtokubeFlags struct {
DNSInternalSuffix *string `json:"dnsInternalSuffix,omitempty" flag:"dns-internal-suffix"`
DNSProvider *string `json:"dnsProvider,omitempty" flag:"dns"`
DNSServer *string `json:"dns-server,omitempty" flag:"dns-server"`
Image *string `json:"image,omitempty" flag:"image"`
Member:

I think this is the etcd image, so how about EtcdImage *string `json:"image,omitempty" flag:"etcd-image"`? It's the flag that is hard to change and matters most.

Contributor (Author):

makes sense and fixed ... image is too generic as well

@@ -195,9 +196,14 @@ type ProtokubeFlags struct {

// ProtokubeFlags is responsible for building the command line flags for protokube
func (t *ProtokubeBuilder) ProtokubeFlags(k8sVersion semver.Version) *ProtokubeFlags {
// @todo: i think we should allow the user to override the source of the image, but for now
// lets keep that for another PR and allow the version change
Member:

Agreed! @chrislovecnm has some PRs that are working towards this.

// EnableEtcdTLS indicates the etcd service should use TLS between peers and clients
EnableEtcdTLS bool `json:"enableEtcdTLS,omitempty"`
// StorageType indicates the storage type of the cluster v2 or v3
StorageType EtcdStorageType `json:"storageType,omitempty"`
Member:

I'm not sure we need to / should specify storageType separately from the etcd version. I think this ends up controlling the apiserver storage-backend flag (there is a separate storage-version for etcd, as well as a serialization format for the apiserver, but I think those were measures to provide rollback functionality which aren't widely used - a backup is the way to go).

Contributor (Author):

admittedly this was copied from a previous PR by yourself ... I think you're right though, there's no real need to specify the storage engine outside of the version .. I suppose if or when kubernetes supports different storage backends it can be amended

// @question: I kinda feel like this functionality should be on the kops.ClusterSpec itself, though I don't
// know how to do this without duplicating methods and how it affects the versions v1, v2 etc
func UseEtcdV3(spec *kops.ClusterSpec) bool {
// validation ensures the etcd clusters must all be one or the other
Member:

Personally I would probably parse the version, and check if the prefix was v2. / 2. or v3. / 3.. I suppose there is also the chance that when we mirror the image we will put a prefix on it (e.g. gcr.io/myimages/etcd_v3.0.17) but I think we don't do that yet.

Specifying the storage-type separately is sort of appealing here, except that it isn't really the storage-type we're specifying, and as you say we would rather this was on the cluster/apiserver. The flag is mapped on the apiserver (you set StorageBackend), so we could let the user set it there if they really want to.

I suspect we keep it simple for now, don't add storage-type, parse the version to infer the etcd version using some simple heuristics, and see what is needed from there.

And that implies that we replace validateEtcdStorage with a similar loop, but checking that the inferred version of etcd is the same (which also lets us alert the user if they specify a version we don't recognize)
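A rough sketch of that suggestion, for illustration only (it assumes the EtcdClusters and Version fields used elsewhere in this PR, and is not the final implementation):

import (
    "fmt"
    "strings"

    "k8s.io/kops/pkg/apis/kops"
)

// UseEtcdV3 reports whether every etcd cluster in the spec carries a
// v3-style version ("3." or "v3." prefix). An empty or v2-style version
// keeps us on etcd2.
func UseEtcdV3(spec *kops.ClusterSpec) bool {
    for _, cluster := range spec.EtcdClusters {
        if !strings.HasPrefix(strings.TrimPrefix(cluster.Version, "v"), "3.") {
            return false
        }
    }
    return true
}

// validateEtcdStorage, reworked as suggested above: every cluster must
// infer to the same etcd major version, and unrecognized versions are
// reported back to the user.
func validateEtcdStorage(spec *kops.ClusterSpec) error {
    inferred := ""
    for _, cluster := range spec.EtcdClusters {
        v := strings.TrimPrefix(cluster.Version, "v")
        var major string
        switch {
        case v == "" || strings.HasPrefix(v, "2."):
            major = "2" // empty falls back to the pinned v2 default
        case strings.HasPrefix(v, "3."):
            major = "3"
        default:
            return fmt.Errorf("unrecognized etcd version %q on cluster %q", cluster.Version, cluster.Name)
        }
        if inferred == "" {
            inferred = major
        } else if major != inferred {
            return fmt.Errorf("all etcd clusters (main and events) must use the same etcd major version")
        }
    }
    return nil
}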

if x.Version == "" {
    switch UseEtcdV3(spec) {
    case true:
        x.Version = "3.0.17"
    default:
        x.Version = "2.2.1" // v2 stays pinned (per the PR description)
    }
}
Member:

We'll want this logic, except I think we'll want to default new clusters to etcd3 in some future kubernetes version.

@@ -82,7 +82,7 @@ func run() error {
flag.StringVar(&tlsKey, "tls-key", tlsKey, "Path to a file containing the private key for etcd server")
flags.StringSliceVarP(&zones, "zone", "z", []string{}, "Configure permitted zones and their mappings")
flags.StringVar(&dnsProviderID, "dns", "aws-route53", "DNS provider we should use (aws-route53, google-clouddns, coredns)")
flags.StringVar(&etcdImageSource, "etcd-image-source", etcdImageSource, "Etcd Source Container Registry")
flags.StringVar(&etcdImageSource, "image", "gcr.io/google_containers/etcd:2.2.1", "Etcd Source Container Registry")
Member:

I suggest etcd-image, but 👍
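The rename would then look something like this (a sketch only; the default value is the one from the diff above, and gambol99 confirms the final flag name below):

flags.StringVar(&etcdImageSource, "etcd-image", "gcr.io/google_containers/etcd:2.2.1", "Etcd Source Container Registry")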

@@ -289,6 +289,7 @@ func (c *populateClusterSpec) run() error {
switch m {
case "config":
// Note: DefaultOptionsBuilder comes first
codeModels = append(codeModels, &components.EtcdOptionsBuilder{Context: optionsContext})
Member:

Let's put this after DefaultOptionsBuilder, just so we don't have to change the comment :-)

@justinsb (Member)

Generally looks great. I did have some minor comments, but the big one is about whether we expose storage-type on the EtcdCluster. I think we could infer it from the version. My inclination is to do that for now, and then we can add more options later. It's tricky though because there are 3 concepts that are all intertwined - the storage-engine that etcd uses, the protocol that the apiserver uses to talk to etcd, and the serialization format that the apiserver uses to serialize objects when writing them to etcd3. I suspect we want to just move all 3 at once now that they've mostly been proven out - the benefits of not being an early etcd3 adopter!
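For reference, the apiserver side of those three concepts comes down to two real kube-apiserver flags; the helper below is purely illustrative and not part of this PR:

// buildAPIServerStorageFlags is a hypothetical helper showing how the
// pieces surface: etcd's storage engine is chosen by which etcd binary
// runs, while the apiserver's protocol and serialization are flags.
func buildAPIServerStorageFlags(useEtcdV3 bool) []string {
    if !useEtcdV3 {
        return []string{"--storage-backend=etcd2"} // v2 protocol, JSON serialization
    }
    return []string{
        "--storage-backend=etcd3",                                  // protocol used to talk to etcd
        "--storage-media-type=application/vnd.kubernetes.protobuf", // serialization format written to etcd3
    }
}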

@justinsb (Member)

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 11, 2017
@justinsb (Member)

But oh the hypocrisy :-) I checked #2169 where I had experimented with this previously, and I had also split storage-type from version. I think I'm right now, but given we both split them, maybe you and past-me are right...

@gambol99 (Contributor, Author) commented Aug 11, 2017

i think your present self is right :-) .. the requirement for a storage type really isn't there and we allow the user to set the storage type in the api-server anyhow

@gambol99 (Contributor, Author)

hi @justinsb ... i've made the following changes and rebased

Requested Changes - Etcd v3

  • removing the StorageType on the etcd cluster spec (sticking with the Version field only)
  • changed the protokube flag back to -etcd-image
  • users have to explicitly set the etcd version now; the latest version in gcr.io is 3.0.17 (see the sketch after this list)
  • reverted the ordering on the populate spec
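A minimal sketch of what the explicit opt-in looks like against the kops API types discussed above (values from this PR; illustration only):

// With StorageType gone, the Version field alone selects etcd v3;
// leaving it empty keeps the pinned v2 default (2.2.1).
cluster := &kops.EtcdClusterSpec{
    Name:    "main",
    Version: "3.0.17", // latest etcd3 image available in gcr.io at the time
}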

@gambol99 (Contributor, Author)

$ kubectl get no 
NAME                                          STATUS         AGE       VERSION
ip-10-250-34-29.eu-west-2.compute.internal    Ready,master   22m       v1.7.0
ip-10-250-34-59.eu-west-2.compute.internal    Ready,master   22m       v1.7.0
ip-10-250-34-6.eu-west-2.compute.internal     Ready,master   22m       v1.7.0
ip-10-250-35-10.eu-west-2.compute.internal    Ready,node     20m       v1.7.0
ip-10-250-35-27.eu-west-2.compute.internal    Ready,node     20m       v1.7.0
ip-10-250-36-178.eu-west-2.compute.internal   Ready,node     20m       v1.7.0
ip-10-250-36-251.eu-west-2.compute.internal   Ready,node     20m       v1.7.0
ip-10-250-36-91.eu-west-2.compute.internal    Ready,node     20m       v1.7.0
$ kubectl --namespace=kube-system get pod | grep etcd
etcd-server-events-ip-10-250-34-29.eu-west-2.compute.internal        1/1       Running   0          21m
etcd-server-events-ip-10-250-34-59.eu-west-2.compute.internal        1/1       Running   0          21m
etcd-server-events-ip-10-250-34-6.eu-west-2.compute.internal         1/1       Running   0          22m
etcd-server-ip-10-250-34-29.eu-west-2.compute.internal               1/1       Running   0          21m
etcd-server-ip-10-250-34-59.eu-west-2.compute.internal               1/1       Running   0          22m
etcd-server-ip-10-250-34-6.eu-west-2.compute.internal                1/1       Running   0          22m
$ kubectl --namespace=kube-system logs etcd-server-ip-10-250-34-29.eu-west-2.compute.internal | grep version
2017-08-11 10:55:56.800257 I | etcdserver: starting server... [version: 3.0.17, cluster version: to_be_decided]
2017-08-11 10:56:07.293765 N | membership: set the initial cluster version to 2.3
2017-08-11 10:56:07.293817 I | api: enabled capabilities for version 2.3
2017-08-11 10:56:15.438148 N | membership: updated the cluster version from 2.3 to 3.0
2017-08-11 10:56:15.438307 I | api: enabled capabilities for version 3.0

nice :-) ... i also verified v2 was unaffected ...

The current implementation is running v2.2.1 which is two years old and end of life. This PR adds the ability to use etcd v3 and set the versions if required. Note at the moment the image is still using the gcr.io registry image. As noted, much like TLS there presently is no 'automated' migration path from v2 to v3.

- the feature is gated behind the storageType of the etcd cluster, both clusters events and main must use the same storage type
- the version for v2 is unchanged and pinned at v2.2.1 with v3 using v3.0.17
- @question: we should consider allowing the user to override the images though I think this should be addressed more generically, rather than as one-offs here and there. I know chris is working on an asset registry??
- removing the StorageType on the etcd cluster spec (sticking with the Version field only)
- changed the protokube flag back to -etcd-image
- users have to explicitly set the etcd version now; the latest version in gcr.io is 3.0.17
- reverted the ordering on the populate spec
@justinsb (Member)

/lgtm

Thanks so much @gambol99

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 12, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gambol99, justinsb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 12, 2017
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 7942869 into kubernetes:master Aug 12, 2017
@gambol99 gambol99 deleted the etcv3 branch August 25, 2017 19:41
@chrislovecnm chrislovecnm mentioned this pull request Sep 29, 2017