
# Advanced configuration for Elastic Maps Server on {{eck}} [k8s-maps-advanced-configuration]

If you already looked at the [{{es}} on ECK](elasticsearch-configuration.md) documentation, some of these concepts might sound familiar to you. The resource definitions in ECK share the same philosophy when you want to:

* Customize the Pod configuration
* [Define {{es}} nodes roles](#k8s-define-elasticsearch-nodes-roles)
* [Pod affinity and anti-affinity](#k8s-affinity-options)
* [Topology spread constraints and availability zone awareness](#k8s-availability-zone-awareness)
* [Zone awareness using the `zoneAwareness` field](#k8s-zone-awareness) {applies_to}`eck: ga 3.4`
* [Manual zone awareness configuration](#k8s-zone-awareness-manual)
* [Hot-warm topologies](#k8s-hot-warm-topologies)

You can combine these features to deploy a production-grade {{es}} cluster.
This example restricts {{es}} nodes so they are only scheduled on Kubernetes hosts.

## Topology spread constraints and availability zone awareness [k8s-availability-zone-awareness]

Distributing {{es}} nodes and shard replicas across failure domains (typically cloud availability zones) is a fundamental requirement for production clusters. ECK provides built-in zone awareness support and also allows manual configuration for advanced use cases.

### Zone awareness using the `zoneAwareness` field [k8s-zone-awareness]

{applies_to}`eck: ga 3.4`

The `zoneAwareness` field on NodeSets is the recommended way to set up availability zone awareness. Instead of manually configuring topology spread constraints, downward node labels, environment variables, and {{es}} allocation awareness settings yourself, you add a single `zoneAwareness` field and ECK handles the rest.

When `zoneAwareness` is set on a NodeSet, the operator automatically:

* Injects a `TopologySpreadConstraint` with `maxSkew: 1` and `whenUnsatisfiable: DoNotSchedule` to evenly spread pods across zones.
* Exposes the Kubernetes node's zone as a `ZONE` environment variable inside each pod using [downward node labels](#k8s-availability-zone-awareness-downward-api).
* Sets `node.attr.zone` and `cluster.routing.allocation.awareness.attributes: k8s_node_name,zone` in the {{es}} configuration.
* Injects a required node affinity rule ensuring that the topology key label exists on the node, so pods are only placed on nodes that carry the label.
* When `zones` are specified, additionally restricts pod placement to those specific zones.

#### Minimal zone awareness

To spread pods across all available zones using defaults:

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 6
    zoneAwareness: {}
```

#### Zone awareness with explicit zones

To restrict pods to specific availability zones:

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  nodeSets:
  - name: hot
    count: 6
    config:
      node.roles: ["data_hot"]
    zoneAwareness:
      zones:
      - us-east1-a
      - us-east1-b
      - us-east1-c
```

#### Zone awareness with custom topology key

By default, `zoneAwareness` uses the `topology.kubernetes.io/zone` node label. To use a different label:

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 6
    zoneAwareness:
      topologyKey: my.custom/zone-label
```

#### Customizing spread behavior

To customize `maxSkew`, `whenUnsatisfiable`, or other topology spread constraint fields, provide a `topologySpreadConstraint` for the same `topologyKey` in the `podTemplate`. The operator preserves user-provided constraints and does not inject its own default for that key:

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 6
    zoneAwareness:
      topologyKey: topology.kubernetes.io/zone
    podTemplate:
      spec:
        topologySpreadConstraints:
        - topologyKey: topology.kubernetes.io/zone
          maxSkew: 3
          whenUnsatisfiable: ScheduleAnyway
```

#### Mixed NodeSets with and without zone awareness

Enable `zoneAwareness` on **all** NodeSets in a cluster for the best results. If some NodeSets are left without `zoneAwareness`, the operator applies safeguards to keep the cluster consistent:

* All NodeSets in the cluster still receive the `ZONE` environment variable and {{es}} zone configuration (`node.attr.zone`, `cluster.routing.allocation.awareness.attributes`) so that shard allocation awareness works cluster-wide.
* NodeSets without `zoneAwareness` receive a required node affinity that ensures the topology key label is present on the node, so their Pods only run on labeled nodes. They do not, however, receive topology spread constraints and might not be evenly distributed across zones.

::::{important}
Adding `zoneAwareness` to any NodeSet triggers a one-time rolling restart of **all** NodeSets in the cluster, because zone-related {{es}} configuration and environment variables are applied cluster-wide. To avoid unnecessary restarts, enable `zoneAwareness` on every NodeSet at the same time.
::::

#### Validation rules

* All zone-aware NodeSets in a cluster must use the same `topologyKey`.
* When using the default topology key (`topology.kubernetes.io/zone`), the operator allows it automatically if the `--exposed-node-labels` flag is unset or empty. If the flag is explicitly set to a non-empty value, the default topology key must be included in the allowed list. Custom topology keys must always be allowed by the operator's `--exposed-node-labels` configuration.
* The `zones` list, when specified, must contain at least one entry and no duplicates.
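
A custom `topologyKey` therefore also requires an operator configuration change. The following is a hedged sketch, assuming the default ECK installation where operator flags are set in the `elastic-operator` ConfigMap in the `elastic-system` namespace; the exact ConfigMap name, namespace, and value syntax depend on your installation:

```yaml
# Sketch only: allow a custom zone label to be exposed to Pods.
# ConfigMap name and namespace match the default ECK install; verify yours.
apiVersion: v1
kind: ConfigMap
metadata:
  name: elastic-operator
  namespace: elastic-system
data:
  eck.yaml: |-
    exposed-node-labels: [my.custom/zone-label]
```

After changing the operator configuration, restart the operator Pod so that the new flag value takes effect.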

### Manual zone awareness configuration [k8s-zone-awareness-manual]

For ECK versions before 3.4.0, or for advanced use cases not covered by the `zoneAwareness` field, you can configure availability zone awareness manually, as described in the following sections.

#### Exposing Kubernetes node topology labels in Pods [k8s-availability-zone-awareness-downward-api]

:::{note}
In Kubernetes 1.35 and later, the `PodTopologyLabelsAdmission` feature is enabled by default. As a result, the node labels `topology.kubernetes.io/region` and `topology.kubernetes.io/zone` are automatically propagated as labels on Pods. This means you can skip the `eck.k8s.elastic.co/downward-node-labels` annotation and avoid additional configuration changes to expose these topology labels in your Pods. In this situation, you can skip the first two steps described below. Note that node labels then appear as Pod labels rather than annotations.
:::
Refer to the next section or to the [{{es}} sample resource in the ECK source repository](https://github.com/elastic/cloud-on-k8s/tree/{{version.eck | M.M}}/config/samples/elasticsearch/elasticsearch.yaml) for a complete example.


#### Using node topology labels, Kubernetes topology spread constraints, and {{es}} shard allocation awareness [k8s-availability-zone-awareness-example]

The following example demonstrates how to use the `topology.kubernetes.io/zone` node labels to spread a NodeSet across the availability zones of a Kubernetes cluster.

By default ECK creates a `k8s_node_name` attribute with the name of the Kubernetes node running the Pod, and configures {{es}} to use this attribute. This ensures that {{es}} allocates primary and replica shards to Pods running on different Kubernetes nodes and never to Pods that are scheduled onto the same Kubernetes node. To preserve this behavior while making {{es}} aware of the availability zone, include the `k8s_node_name` attribute in the comma-separated `cluster.routing.allocation.awareness.attributes` list.

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
In this example, we configure two groups of {{es}} nodes.

::::{note}
This example uses [Local Persistent Volumes](https://kubernetes.io/docs/concepts/storage/volumes/#local) for both groups, but can be adapted to use high-performance volumes for `hot` {{es}} nodes and high-storage volumes for `warm` {{es}} nodes.
::::


Finally, set up [Index Lifecycle Management](/manage-data/lifecycle/index-lifecycle-management.md) policies on your indices, [optimizing for hot-warm architectures](https://www.elastic.co/blog/implementing-hot-warm-cold-in-elasticsearch-with-index-lifecycle-management).



# Deploy Elastic Maps Server [k8s-maps-es]

You can deploy Elastic Maps Server with a basic manifest:

```yaml subs=true
apiVersion: maps.k8s.elastic.co/v1alpha1
kind: ElasticMapsServer
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  count: 1
```

```sh
kubectl patch sts elastic-operator -n elastic-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager", "image":"docker.elastic.co/eck/eck-operator-fips:${ECK_VERSION}"}]}}}}'
```

## Operator-managed {{es}} FIPS keystore password [k8s-fips-keystore-password]

When FIPS mode is enabled in {{es}} (`xpack.security.fips_mode.enabled: true`), {{es}} requires a password-protected keystore. Starting with ECK 3.4.0 and {{es}} 9.4.0 or later, the operator automatically manages this for you by generating, storing, and configuring the {{es}} keystore password, which eliminates the need for manual `podTemplate` overrides.

The operator creates a Secret named `<cluster-name>-es-keystore-password` containing the generated password and mounts it into the {{es}} pods. The keystore init container uses this password to create a password-protected keystore.
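
As an illustration, the following sketch enables FIPS mode on a NodeSet; with {{es}} 9.4.0 or later and ECK 3.4.0 or later, the operator then generates and wires up the keystore password on its own. The cluster name is an example, and the FIPS-compatible image configuration is omitted for brevity:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: fips-cluster
spec:
  version: 9.4.0
  nodeSets:
  - name: default
    count: 3
    config:
      # With this setting, the operator creates and mounts the
      # fips-cluster-es-keystore-password Secret automatically.
      xpack.security.fips_mode.enabled: true
```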

This feature activates automatically when all the following conditions are met:

* `xpack.security.fips_mode.enabled: true` is set in any NodeSet config or through a StackConfigPolicy
* The {{es}} version is 9.4.0 or later
* No user-provided keystore password is detected

If you have already configured a keystore password through environment variables (`KEYSTORE_PASSWORD`, `KEYSTORE_PASSWORD_FILE`, or `ES_KEYSTORE_PASSPHRASE_FILE`) in the `podTemplate`, the operator respects your configuration and does not generate its own.
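
For example, the following hedged sketch keeps keystore password management in your hands by setting `KEYSTORE_PASSWORD` from a Secret; the Secret name and key are hypothetical:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: fips-cluster
spec:
  version: 9.4.0
  nodeSets:
  - name: default
    count: 3
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: KEYSTORE_PASSWORD
            valueFrom:
              secretKeyRef:
                name: my-keystore-password  # hypothetical user-managed Secret
                key: password
```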

When FIPS mode is disabled or the {{es}} version is downgraded below 9.4.0, the operator automatically cleans up the managed keystore password Secret.


5 changes: 0 additions & 5 deletions deploy-manage/deploy/cloud-on-k8s/elastic-maps-server.md

# Elastic Maps Server [k8s-maps]

If you cannot connect to Elastic Maps Service from the {{kib}} server or browser clients, and you are running ECK with an Enterprise license, you can opt to host the service on your Kubernetes cluster. Check also the [Elastic Maps Server documentation.](/explore-analyze/visualize/maps/maps-connect-to-ems.md#elastic-maps-server)

The following sections describe how to customize an Elastic Maps Server deployment to suit your requirements.
5 changes: 0 additions & 5 deletions deploy-manage/deploy/cloud-on-k8s/http-configuration.md

# Elastic Maps HTTP configuration [k8s-maps-http-configuration]

## Load balancer settings and TLS SANs [k8s-maps-http-publish]

By default, a `ClusterIP` [service](https://kubernetes.io/docs/concepts/services-networking/service/) is created and associated with the Elastic Maps Server deployment. If you want to expose maps externally with a [load balancer](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer), it is recommended to include a custom DNS name or IP in the self-generated certificate.
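
Both aspects can be configured in the `http` section of the Elastic Maps Server resource. A sketch, where the DNS name and IP are placeholders for your own values:

```yaml
apiVersion: maps.k8s.elastic.co/v1alpha1
kind: ElasticMapsServer
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  count: 1
  http:
    service:
      spec:
        # Expose the service through the cloud provider's load balancer.
        type: LoadBalancer
    tls:
      selfSignedCertificate:
        subjectAltNames:
        - dns: maps.example.com   # placeholder DNS name
        - ip: 203.0.113.10        # placeholder IP
```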
5 changes: 0 additions & 5 deletions deploy-manage/deploy/cloud-on-k8s/map-data.md

# Map data [k8s-maps-data]

The Elastic Maps Server Docker image contains only a few zoom levels of data. To get the map data up to the highest zoom level, Elastic Maps Server needs a basemap file mounted into its container.
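
One way to provide the basemap file is to mount a volume through the `podTemplate`. The following is a hedged sketch, assuming a PersistentVolumeClaim named `maps-data` already contains the downloaded basemap; the container name and mount path are assumptions and should be verified against your Elastic Maps Server deployment:

```yaml
apiVersion: maps.k8s.elastic.co/v1alpha1
kind: ElasticMapsServer
metadata:
  name: quickstart
spec:
  version: {{version.stack}}
  count: 1
  podTemplate:
    spec:
      containers:
      - name: maps                      # assumed container name
        volumeMounts:
        - name: map-data
          mountPath: /usr/src/app/data  # assumed basemap location
          readOnly: true
      volumes:
      - name: map-data
        persistentVolumeClaim:
          claimName: maps-data          # hypothetical PVC holding the basemap
```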

## Basemap download [k8s-maps-basemap-download]
80 changes: 80 additions & 0 deletions deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md
This section covers the following topics:
* [StatefulSets orchestration](#k8s-statefulsets)
* [Limitations](#k8s-orchestration-limitations)
* [Advanced control during rolling upgrades](#k8s-advanced-upgrade-control)
* [Cluster Rolling Restart](#cluster-rolling-restart)
* [Restart allocation delay](#restart-allocation-delay)

## NodeSets overview [k8s-nodesets]

::::{warning}
* These predicates might change in the future. We will be adding, removing, and renaming them over time, so be careful when adding them to any automation.
* Also, make sure you remove them after use by running `kubectl annotate elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-sample eck.k8s.elastic.co/disable-upgrade-predicates-`.
::::

## Cluster Rolling Restart [cluster-rolling-restart]

```{applies_to}
eck: ga 3.4.0
```

You can trigger a graceful rolling restart of an {{es}} cluster without changing the cluster spec (version, image, or pod template). The operator reuses the same rolling upgrade path: it uses the {{es}} node shutdown API, respects the same [ECK upgrade predicates](cloud-on-k8s://reference/upgrade-predicates.md), and restarts one node at a time.

### Annotations

Set these annotations on the `Elasticsearch` resource metadata:

| Annotation | Description |
|------------|-------------|
| `eck.k8s.elastic.co/restart-trigger` | **Required to trigger.** Set or change this value (for example to a timestamp) to start a rolling restart. The value is propagated to pod annotations and is visible in the {{es}} node shutdown API response as the shutdown reason. |

You can also set the [`eck.k8s.elastic.co/restart-allocation-delay`](#restart-allocation-delay) annotation to control the shard allocation delay during the restart.

To trigger another rolling restart later, update the `restart-trigger` value (for example, to a new timestamp). Removing the annotation does **not** trigger a new restart; the operator retains the last trigger value on the pod template. Removing the annotation also does **not** cancel an in-progress rolling restart: pods that have not yet restarted are still restarted with the previous trigger value. Re-applying the same `restart-trigger` value (for example, after removing it and setting it again) might not trigger a new rolling restart if all pods already carry that value. In both cases, the operator may emit a non-blocking admission webhook warning.

### Example

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: my-cluster
  annotations:
    eck.k8s.elastic.co/restart-trigger: "2026-01-14T12:00:00Z"
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 3
    config:
      node.roles: ["master", "data", "ingest", "ml"]
      node.store.allow_mmap: false
```

Progress is visible in the {{es}} resource status under **In Progress Operations** → **Upgrade**, with node-level messages such as "Deleting pod for rolling restart".

## Restart allocation delay [restart-allocation-delay]

```{applies_to}
eck: ga 3.4.0
```

The `eck.k8s.elastic.co/restart-allocation-delay` annotation controls the `allocation_delay` parameter passed to the {{es}} node shutdown API when nodes are taken offline. The value applies during both **upgrades** and **manually triggered [rolling restarts](#cluster-rolling-restart)**.

Set this annotation on the `Elasticsearch` resource metadata:

| Annotation | Description |
|------------|-------------|
| `eck.k8s.elastic.co/restart-allocation-delay` | Optional. A duration string (for example `"5m"`, `"20m"`) that tells {{es}} how long to wait before reallocating shards from a node that is shutting down. If unset, the {{es}} default is used. Invalid or negative values are logged and ignored. |

By default, when a node begins shutting down, {{es}} waits a short period before it starts moving shards to other nodes. Setting a longer `allocation_delay` avoids unnecessary shard movements during planned restarts where the node is expected to return quickly. Setting a shorter value causes {{es}} to start rebalancing sooner, which can be useful if the restart is expected to take a long time.

### Example

```yaml subs=true
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: my-cluster
  annotations:
    eck.k8s.elastic.co/restart-allocation-delay: "20m"
spec:
  version: {{version.stack}}
  nodeSets:
  - name: default
    count: 3
    config:
      node.roles: ["master", "data", "ingest", "ml"]
      node.store.allow_mmap: false
```

In this example, the 20-minute delay applies whenever ECK restarts a node, whether the restart is part of a version upgrade, a spec change, or a manual rolling restart triggered by `eck.k8s.elastic.co/restart-trigger`.