
Autoscaling Elasticsearch: Introduce a dedicated custom resource #5978

Merged
barkbay merged 36 commits into elastic:main on Sep 23, 2022

Conversation

@barkbay barkbay commented Aug 29, 2022 (Contributor)

This PR introduces a dedicated Custom Resource Definition (CRD), and an associated controller, to configure Elasticsearch autoscaling with ECK.

Naming and group

This PR introduces a new resource named ElasticsearchAutoscaler. It lives in a group named autoscaling.k8s.elastic.co:

apiVersion: autoscaling.k8s.elastic.co/v1alpha1
kind: ElasticsearchAutoscaler

This is mostly to leave the door open for any additional autoscaling resources we may want to add in the future.

ElasticsearchAutoscaler specification

The new CRD is very similar to the existing autoscaling annotation, with the obvious difference that an elasticsearchRef must be provided by the user:

apiVersion: autoscaling.k8s.elastic.co/v1alpha1
kind: ElasticsearchAutoscaler
metadata:
  name: autoscaling-sample
spec:
  elasticsearchRef:
    name: elasticsearch-sample
  policies:
    - name: di
      roles: ["data", "ingest" , "transform"]
      ## Optional: section below can be used if fine-grain tuning of the Elasticsearch deciders is required.
      #deciders:
      #  proactive_storage:
      #    forecast_window: 5m
      resources:
        nodeCount:
          min: 3
          max: 8
        cpu:
          min: 2
          max: 8
        memory:
          min: 2Gi
          max: 16Gi
        storage:
          min: 64Gi
          max: 512Gi
    - name: ml
      roles:
        - ml
      resources:
        nodeCount:
          min: 1
          max: 9
        cpu:
          min: 1
          max: 4
        memory:
          min: 2Gi
          max: 8Gi
        storage:
          min: 1Gi
          max: 1Gi

Only one cluster can be managed by a given ElasticsearchAutoscaler, similar to the Kubernetes HorizontalPodAutoscaler and VerticalPodAutoscaler. This also makes it easier to understand the autoscaler status and the relationship between the autoscaler and the Elasticsearch cluster.

Status

The status consists of two main elements:

  1. conditions provides an overall view of the reconciliation state.
  2. policies holds the calculated resources and any errors or important messages.

status:
  conditions:
  - lastTransitionTime: "2022-08-29T11:11:43Z"
    status: "False"
    type: Limited
  - lastTransitionTime: "2022-08-27T17:02:32Z"
    status: "True"
    type: Healthy
  - lastTransitionTime: "2022-08-27T17:01:41Z"
    status: "True"
    type: Active
  - lastTransitionTime: "2022-08-27T17:02:32Z"
    message: Elasticsearch is available
    status: "True"
    type: Online
  observedGeneration: 3
  policies:
  - lastModificationTime: "2022-08-29T11:19:07Z"
    name: di
    nodeSets:
    - name: di
      nodeCount: 7
    resources:
      limits:
        cpu: "2"
        memory: 8Gi
      requests:
        cpu: "2"
        memory: 8Gi
        storage: 5Gi
  - lastModificationTime: "2022-08-29T11:19:07Z"
    name: ml
    nodeSets:
    - name: ml
      nodeCount: 0
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi
        storage: 1Gi

Printed columns

The conditions Limited, Active and Healthy are printed as part of the output of kubectl get elasticsearchautoscaler.autoscaling.k8s.elastic.co/<autoscaler_name>:

NAME                                                                    TARGET                 ACTIVE   HEALTHY   LIMITED
elasticsearchautoscaler.autoscaling.k8s.elastic.co/autoscaling-sample   elasticsearch-sample   True     True      True
  • TARGET: the name of the Elasticsearch cluster being autoscaled.
  • ACTIVE: True when the ElasticsearchAutoscaler resource is managed by the operator and the target Elasticsearch cluster exists.
  • HEALTHY: True when resources have been calculated for all the autoscaling policies and no error was encountered during the reconciliation process.
  • LIMITED: True when a resource limit has been reached.

For each printed column, an additional message is available in the corresponding status condition.
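
To read those messages, the status can be inspected directly with kubectl; for example (using the autoscaling-sample resource from the spec above):

kubectl get elasticsearchautoscaler autoscaling-sample -o jsonpath='{.status.conditions}'
# or, for a more readable view of the whole status:
kubectl describe elasticsearchautoscaler autoscaling-sample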

Noteworthy change

By default there is now a 1:1 ratio between the CPU request and the CPU limit. This is to comply with the current desired nodes API implementation.
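
Concretely, this means the container resources generated for an autoscaled tier always carry identical CPU values in requests and limits, as in the status example above; an illustrative fragment:

resources:
  requests:
    cpu: "2"
    memory: 8Gi
  limits:
    cpu: "2"
    memory: 8Gi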

Testing

  • Elasticsearch autoscaling is an Enterprise feature, both at the ECK and at the Elasticsearch level; in dev mode you can start a trial.
  • Autoscaling events can be generated by using the fixed decider; this is what is done in the e2e tests, for example (a YAML equivalent is sketched after the snippet):
    // Use the fixed decider to trigger a scale up of the data tier up to its max memory limit and 3 nodes.
    esaScaleUpStorageBuilder := autoscalingBuilder.DeepCopy().WithFixedDecider("data-ingest", map[string]string{"storage": "19gb", "nodes": "3"})
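
The same effect can be sketched directly in an ElasticsearchAutoscaler spec by configuring the fixed decider on a policy. The fragment below is a rough, hypothetical YAML equivalent of the builder call above (the policy name, roles and resource bounds are illustrative; the fixed decider itself should only be used for testing):

# Hypothetical policy fragment, to be placed under spec.policies:
- name: data-ingest
  roles: ["data", "ingest"]
  deciders:
    fixed:
      storage: 19gb
      nodes: "3"
  resources:
    nodeCount:
      min: 1
      max: 3
    memory:
      min: 2Gi
      max: 4Gi
    storage:
      min: 1Gi
      max: 20Gi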

TODO:

@barkbay barkbay added the >feature (Adds or discusses adding a feature to the product), autoscaling, and v2.5.0 labels Aug 29, 2022
@barkbay barkbay marked this pull request as draft August 29, 2022 15:12

@barkbay barkbay commented Aug 31, 2022 (Contributor Author)

The last commit enables advanced validation:

  • Either by using the admission controller:
for: "config/recipes/autoscaling/elasticsearch.yaml": admission webhook "elastic-esa-validation-v1alpha1.k8s.elastic.co" denied the request:
ElasticsearchAutoscaler.autoscaling.k8s.elastic.co "autoscaling-sample" is invalid:
spec.policies[0].resources.nodeCount.min: Invalid value: -1: min count must be equal or greater than 0
  • Or at the conditions level if the webhook is disabled:
status:
  conditions:
  - lastTransitionTime: "2022-08-31T09:58:03Z"
    message: Autoscaler is unhealthy
    status: "True"
    type: Active
  - lastTransitionTime: "2022-08-31T09:58:03Z"
    message: 'ElasticsearchAutoscaler.autoscaling.k8s.elastic.co "autoscaling-sample"
      is invalid: spec.policies[0].resources.nodeCount.min: Invalid value: -1: min
      count must be equal or greater than 0'
    status: "False"
    type: Healthy
  - lastTransitionTime: "2022-08-31T09:58:03Z"
    message: Autoscaler is unhealthy
    status: "False"
    type: Online
  - lastTransitionTime: "2022-08-30T11:53:45Z"
    status: "False"
    type: Limited
  observedGeneration: 2

Note that the custom resource is considered "unhealthy" in that case:

NAME                                                                    TARGET                 ACTIVE   HEALTHY   LIMITED
elasticsearchautoscaler.autoscaling.k8s.elastic.co/autoscaling-sample   elasticsearch-sample   True     False     False
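
For reference, a spec fragment that triggers this validation error would look roughly like the following (illustrative, only the offending nodeCount.min value matters):

spec:
  elasticsearchRef:
    name: elasticsearch-sample
  policies:
    - name: di
      roles: ["data", "ingest", "transform"]
      resources:
        nodeCount:
          min: -1   # rejected: min count must be equal or greater than 0
          max: 8
        memory:
          min: 2Gi
          max: 16Gi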

@barkbay barkbay marked this pull request as ready for review September 5, 2022 06:36
@barkbay barkbay commented Sep 12, 2022 (Contributor Author)

> @barkbay the initial part that caught me when beginning to look at this, is that the cpu validation seems to differ from standard k8s resources.*.cpu validation.
>
>   • : Invalid value: "": "spec.policies.resources.cpu.min" must validate at least one schema (anyOf)
>   • spec.policies.resources.cpu.min: Invalid value: "number": spec.policies.resources.cpu.min in body must be of type integer: "number"
>
> Is this intentional?

Unless I'm missing something, there is nothing specific to the generation of the OpenAPI v3 schema for Quantity values in this PR.
Note that my IDE expects either an integer or a string (using the m suffix):

[screenshot]

With the m suffix:

[screenshot]

With quotes:

[screenshot]
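
In other words, the generated schema accepts values such as the following for the cpu bounds (illustrative):

cpu:
  min: 2         # plain integer
  max: "2500m"   # quoted string, optionally using the m (milli-CPU) suffix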

@barkbay barkbay commented Sep 12, 2022 (Contributor Author)

@naemono see also:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: cpu-example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "1"
      requests:
        cpu: "0.5"
    args:
    - -cpus
    - "2"

I don't understand how the example you mentioned is not rejected by the API server tbh.

@naemono naemono commented Sep 12, 2022 (Contributor)

> I don't understand how the example you mentioned is not rejected by the API server tbh.

I think I'm simply assuming that Elasticsearch.spec.nodeSets[].podTemplate.spec.containers[].resources follows the same validation rules as Pod.spec.containers[].resources. Strangely, when I apply the following, it works without failure, even without quotes.

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: 1.0
      requests:
        cpu: 0.5
    args:
    - -cpus
    - "2"

It's probably not worth digging in to find out why the validation is slightly different, but it certainly seems to be.

@pebrc pebrc self-assigned this Sep 19, 2022
@pebrc pebrc left a comment (Collaborator)

I tried to look through the source code and it looks really good; for me it is ready to merge. I only found a few nits here and there and one incorrect usage of the log API. I also ran some tests and found some issues. However, I am not sure whether they are related to this PR or whether we already have the same issues with annotation-based autoscaling.

What I am seeing is that the Elasticsearch resource never fully reconciles (or does so only briefly). This is because the desired nodes API integration throws errors like the following:

2022-09-19T20:45:16.341+0200	ERROR	manager.eck-operator	Reconciler error	{"service.version": "2.5.0-SNAPSHOT+8831df33", "controller": "elasticsearch-controller", "object": {"name":"es","namespace":"default"}, "namespace": "default", "name": "es", "reconcileID": "896949ed-99d7-4f71-a02f-c839d853a9aa", "error": "elasticsearch client failed for https://es-es-internal-http.default.svc:9200/_internal/desired_nodes/651bb9ea-360b-462f-87a1-fe5415d78ac5/8?error_trace=true: 400 Bad Request: {Status:400 Error:{CausedBy:{Reason: Type:} Reason:Desired nodes with history [651bb9ea-360b-462f-87a1-fe5415d78ac5] and version [8] already exists with a different definition Type:

It seems that the same generation of the ES resource is reconciled multiple times (expected, I would say) but that the desired nodes differ within one generation. I have not dug deeper yet into why exactly this is; it seems likely that it is related to changing resources for the autoscaled node sets (the part that confuses me here is that if the resources change I would also expect the generation to change):

<       "storage": "3221225472b",
---
>       "storage": "2147483648b",

I think this is because we take the desired storage from the volume claims, which are being resized one by one while the Elasticsearch resource is not changing.

Because the nodes reconciliation has already happened at this point, the autoscaling itself is not negatively affected and the cluster keeps scaling correctly. But the Elasticsearch status is stuck in applying changes and the log is full of reconciler errors.

pkg/controller/autoscaling/elasticsearch/controller.go
truncated := ""
count := 0
for _, char := range s {
truncated += string(char)
pebrc (Collaborator):

Nit: could you not just re-slice the underlying byte array once you have reached n, instead of concatenating the runes?

barkbay (Contributor Author):

Let me know if this is what you had in mind: fe77306
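
For the record, a minimal sketch of the re-slicing idea (not necessarily what fe77306 does): iterate over the string with range, which advances one rune at a time, and slice once n runes have been seen.

// truncate returns the first n runes of s. range yields the byte index of
// each rune start, so s[:i] is always a valid UTF-8 prefix.
func truncate(s string, n int) string {
	count := 0
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s
}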

},
},
checker: yesCheck,
},
pebrc (Collaborator):

Missing the wantValidationError here?

barkbay (Contributor Author):

We actually support the case where the user is not relying on the default volume claim. I added a comment in the unit test here.

@barkbay barkbay commented Sep 21, 2022 (Contributor Author)

> This is because the desired nodes API integration throws errors like the following:
> [...]
> But the Elasticsearch status is stuck in applying changes and the log is full of the reconciler errors.

I hit a similar issue in #5979. I think we need to revisit the way we use the desired nodes API for the next release 😕

@thbkrkr thbkrkr left a comment (Contributor)

Very nice work!

Nit: almost all k8s resources displayed via kubectl have an "AGE" column, but not this one. No big deal, but my eyes aren't used to it.

I spotted a small limitation, I think: we can't create an autoscaler for an ES that has a single nodeSet without explicit `node.roles`, which seems acceptable to me:

  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
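
For completeness, a sketch of what the nodeSet would need so that an autoscaling policy can match it (the roles mirror the di policy from the description; values are illustrative):

  nodeSets:
  - name: di
    count: 3
    config:
      node.store.allow_mmap: false
      node.roles: ["data", "ingest", "transform"]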

@pebrc pebrc left a comment (Collaborator)

LGTM!

@barkbay barkbay commented Sep 22, 2022 (Contributor Author)

@naemono I think I addressed your comments, please let me know if I missed something 🙇

@naemono naemono left a comment (Contributor)

This looks great. Nice work @barkbay

@barkbay barkbay merged commit e7bd34c into elastic:main Sep 23, 2022
barkbay added a commit that referenced this pull request Sep 23, 2022
Follow up of #5978 which has been merged into main with the wrong controller tools version.
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023
…stic#5978)

This (huge) commit introduces a dedicated Kubernetes Resource Definition, and an associated controller, to configure Elasticsearch autoscaling with ECK.
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023
Follow up of elastic#5978 which has been merged into main with the wrong controller tools version.

Labels: autoscaling, >feature (Adds or discusses adding a feature to the product), v2.5.0

5 participants