Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate Elasticsearch resource names #1647

Merged
merged 7 commits into from
Sep 4, 2019

Conversation

charith-elastic
Copy link
Contributor

Adds checks to ensure that the resource names generated by the operator for a given Elasticsearch object will be valid.

Fixes #1474

@charith-elastic charith-elastic added >enhancement Enhancement of existing functionality v1.0.0-beta1 labels Aug 28, 2019
}

// length of the ordinal suffix that will be added to the pods of this sset
podOrdinalSuffixLen := len(strconv.FormatInt(int64(nodeSpec.NodeCount), 10))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this be nodeSpec.NodeCount - 1? We start numbering from 0, so for 100 nodes the last ordinal will be 99.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intentional to account for the dash before the ordinal but now that I think about it, it doesn't work for node counts below 10.

// length of the ordinal suffix that will be added to the pods of this sset
podOrdinalSuffixLen := len(strconv.FormatInt(int64(nodeSpec.NodeCount), 10))
// there should be enough space for the ordinal suffix
if validation.DNS1123SubdomainMaxLength-len(ssetName) < podOrdinalSuffixLen {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this account for dash before the ordinal?

func truncate(s string, length int) string {
var b strings.Builder
for _, r := range s {
if b.Len()+utf8.RuneLen(r) > length {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect anything other than single byte characters? Should we validate that somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubernetes' own validation checks will not let multi-byte names to be submitted at the moment but I prefer not to rely on that assumption.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Should we check it in the Validate function then? It seems here we truncate based on (potential) multibyte length, but other than that we allow it and rely on k8s own validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to provide some context here, the validation webhook simply transitions the object to the INVALID state when the custom validation checks fail. Trying to delete the object causes a problem because the finalizers start running and then start to trip over the invalid names -- making it impossible to delete. This is why I introduced the truncate function so that the object can be deleted even when it is invalid. It should never be used in the happy path and therefore whether the string is multi-byte or not does not really matter.

@@ -70,6 +70,9 @@ func Validate(es v1alpha1.Elasticsearch) error {

// length of the ordinal suffix that will be added to the pods of this sset
podOrdinalSuffixLen := len(strconv.FormatInt(int64(nodeSpec.NodeCount), 10))
if nodeSpec.NodeCount < 10 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I follow why this is needed. I'd think you always need to account for dash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I made the mistake of thinking in round numbers only.

Copy link
Contributor

@sebgl sebgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to do something about

type NodeSpec struct {
// Name is a logical name for this set of nodes. Used as a part of the managed Elasticsearch node.name setting.
// +kubebuilder:validation:Pattern=[a-zA-Z0-9-]+
// +kubebuilder:validation:MaxLength=19
// TODO: refactor and explain name length conventions
Name string `json:"name"`

@charith-elastic
Copy link
Contributor Author

Max Kubernetes resource name length = 253
Max Elasticsearch name length = 36

If we allow for a maximum node count of 1000, then the ordinal suffix length for pods will be at most 4 (-999) so node spec name length max could be 253-36-4-1 = 212.

However, given that we restrict Elasticsearch name length to 36, should we also just set the node spec name length to 36 for consistency as well? @sebgl

@sebgl
Copy link
Contributor

sebgl commented Aug 29, 2019

The biggest constraint we have is label length, limited to 63 characters.
We do use, for example, StatefulSet names (cluster name-node spec name-ordinal) as labels on pods.
The max Elasticsearch length of 36 is something we computed based on that, reverting backward from 63 using the maximum suffix we added at the time (pod name based on cluster name-longest secret name suffix).
The current NodeSpec length of 19 was also somewhat derived from that, but should be updated to something that makes more sense.

What's hard is to either:

  1. pick a fixed max ES cluster name + a fixed max NodeSpec name for which both appended fit into 63 characters (with dashes)
  2. consider both length are elastic (pun not intended 😄) so the sum of both is not larger than 63, but don't restrict a fixed length for any of those

I guess an in-between approach where we rely on 2, but still pick some fixed max length for both Elasticsearch and the NodeSpec (even though it may lead to an incompatible length that we still reject) may be ok. The values of 36 and 19 are probably nonsense now and should be updated.

@charith-elastic
Copy link
Contributor Author

(2) is what I wanted to get at as well but it's more difficult than I thought because name generation is not centralised (I missed the labelling part for example). Looks like I'll have to refactor the naming bits a little bit more to make this work.

@sebgl
Copy link
Contributor

sebgl commented Aug 29, 2019

I'm wondering whether it would be that bad to just come up with fixed numbers. For example something like: cluster name is limited to 36 characters, node spec name is limited to 21 characters.
Benefit would be that we don't actually have to check anything: name restrictions would be enforced in the API schema directly by Kubernetes. We've seen some people disable our validation webhook for example, making things fail at runtime instead of failing at ES resource creation/update time.

I can imagine some users may want more flexibility (larger cluster names, smaller node spec names), not sure how big of a use case that is.
(I don't have strong opinions on this, just want to make sure we do consider it).

@charith-elastic
Copy link
Contributor Author

Summarised my findings in the original issue itself: #1474 (comment)

Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very comprehensive to me! 👍

Copy link
Contributor

@thbkrkr thbkrkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I have just a very small reservation about the relevance of adding e2e tests just for that. rejectionOfLongName makes sense. longestPossibleName could be done using another test if at some point we want to save on the number of e2e tests.

@charith-elastic
Copy link
Contributor Author

@thbkrkr because naming functions are not centralised, it is difficult to be 100% sure that the validation function covers all possible suffixes that could be added to a name. The longestPossibleName test is there to make sure that we catch any cases that the validation function is not aware of.

@charith-elastic charith-elastic merged commit 576f07f into elastic:master Sep 4, 2019
@charith-elastic charith-elastic deleted the naming-validation branch September 4, 2019 06:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement Enhancement of existing functionality v1.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setup corrected ES naming length conventions
5 participants