Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow changing AZ of masters #9017

Merged
merged 2 commits into from
May 1, 2020
Merged

Conversation

olemarkus
Copy link
Member

Fixes #8817

I have not done any testing on terraform though.
Care still needs to be taken when deleting master IGs. If you break quorum you need to restore etcd from backup. Preventing master IGs in an unsafe way is already possible though, so I didn't do anything to prevent this here.

* Ensure every master runs etcd
* Make it possible to remove masters
* "Cross" Validate on IG creation
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 28, 2020
}
}
if !hasEtcd {
allErrs = append(allErrs, field.NotFound(field.NewPath("spec", "etcdClusters").Key(etcd.Name), g.ObjectMeta.Name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
allErrs = append(allErrs, field.NotFound(field.NewPath("spec", "etcdClusters").Key(etcd.Name), g.ObjectMeta.Name))
allErrs = append(allErrs, field.NotFound(field.NewPath("spec", "etcdClusters").Key(k).Key("name"), etcd.Name, "has no member for master instancegroup " + g.ObjectMeta.Name))

where k is the index from the range over EtcdClusters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the NotFound error is the other way around. It is the instancegroup's Name that isn't found in the etcdcluster's members. So it should be field.NewPath("objectMeta", "name")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this.

NotFound unfortunately doesn't take a third description argument, so I was looking hard at how to make this more clear.

If I understand the NotFound description correctly, the error's field should point to the missing value, not the "cause" of the missing value (which may not even be a path). Since the validation happens as a part of a lookup in the etcdSpec, I found this direction to be correct.

Question is how to make this clear to users. The most programatically correct solution would be field.NotFound(field.NewPath("spec", "etcdClusters").Index(k).Key("instanceGroup") but not sure that is the most understandable for users as it won't contain the etcd cluster name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the path to be more correct + added a deep validation test to see how this will look like to the end user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value that's missing is the instancegroup name.

I think we need to use field.Forbidden here. I believe field.NotFound is for when there is an obvious list of possible choices for the field's value. field.NotFound on a path inside spec.etcdClusters would be appropriate for a validation that checks if an etcdCluster Member's InstanceGroup field value was actually a defined InstanceGroup (possibly restricting to instancegroups with role master).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be reasonable for a five-master-IG cluster to have a Cilium etcd cluster with only three members? Perhaps not worth supporting that edge case, but maybe this restriction should only apply to the two etcd clusters used by apiserver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Forbidden is worth it just for the details string. I'll switch to that.

Having a different size for the cilium cluster could perhaps be interesting, but since we currently use the internal master DNS entry for this cluster, it won't work (or at least cilium will try to connect to members that don't exist). If someone really wants this, I can look into it later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johngmyers Which field would you say should be the target here?
There is a similar case when creating an IG with a subnet that doesn't exist. Here we use NotFound, but the path is much simpler.
With Forbidden, it makes more sense to point to spec itself, but not sure that makes that much sense. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field is field.NewPath("objectMeta", "name"), of the instancegroup. The detail would be something like instanceGroup "eu-central-1a" must have a member in etcdCluster "main".

@johngmyers
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 1, 2020
@rifelpet
Copy link
Member

rifelpet commented May 1, 2020

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: olemarkus, rifelpet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2020
@rifelpet
Copy link
Member

rifelpet commented May 1, 2020

/retest

@k8s-ci-robot k8s-ci-robot merged commit 07f9949 into kubernetes:master May 1, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone May 1, 2020
@olemarkus olemarkus deleted the change-zone branch May 10, 2020 08:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unable to change zones of control plane
4 participants