Troubleshooting Elasticsearch Upgrades by stefnestor · Pull Request #6396 · elastic/docs-content

stefnestor · 2026-05-09T15:03:09Z

Summary

Follow-up of #5848 to create separate troubleshooting page.

Generative AI disclosure

Did you use a generative AI (GenAI) tool to assist in creating this contribution?

Yes
No

github-actions · 2026-05-09T15:03:33Z

Elastic Docs AI PR menu

Check the box to run an AI review for this pull request.

Review docs changes (docs-review). Status: completed.

Powered by GitHub Agentic Workflows and docs-actions. For more information, reach out to the docs team.

github-actions

Docs review summary

Focus areas

Style and clarity: Typos found in two files ("earlier ersion" in elasticsearch.md, "due dilligence" in the new file); grammar issue in discovery-troubleshooting.md. "Kindly" should be dropped per style guide.
Jargon: No unexplained jargon introduced.
Frontmatter and applies_to: `applies_to: stack:` is missing a lifecycle value (e.g., `ga`) in the new file; this will likely fail validation or render incorrectly.
Content type fit: The new page is declared type: troubleshooting, which is appropriate. However, the required Symptoms and Resolution sections are entirely unfilled template placeholders. The Resolution section still contains the literal template stepper block. The page is linked from the upgrade guide, meaning users who follow that tip will land on an incomplete page.
Parent issue satisfaction: This PR is a follow-up to #5848. The page structure is in place but the core content (Symptoms, Resolution) is not yet written, so the issue is partially satisfied.

Notes

The empty ## heading on line 18 of troubleshooting-upgrades.md will likely cause a build or rendering failure — this is the highest-priority fix.
Vale found no findings (eligible-files list was empty in the pre-fetch), so no Vale-sourced nits to report.

Generated by Docs review agent for issue #6396 · ● 325.5K

github-actions · 2026-05-10T17:56:55Z

+description: "Common upgrade issues and resolutions."
+type: troubleshooting
+applies_to:
+  stack:


applies_to: stack: is missing a lifecycle value. Refer to the cumulative-docs reference for valid values (e.g., ga, beta, coming). Without a value this will likely render incorrectly or fail validation.
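
A check like this could catch the missing value before review. The following is a minimal sketch, not part of the docs tooling: the function name `keys_missing_value` is hypothetical, and it only handles the simple two-level block shown in the diff above rather than general YAML. A key whose value is a nested mapping on the following lines would be flagged as well.

```python
import re


def keys_missing_value(frontmatter: str, parent: str = "applies_to") -> list[str]:
    """Return child keys of `parent` that have nothing after the colon.

    Line-based sketch only: it does not parse YAML, it just scans the
    indented lines that directly follow the `parent:` line.
    """
    missing = []
    inside = False
    for line in frontmatter.splitlines():
        if re.match(rf"^{re.escape(parent)}:\s*$", line):
            inside = True
            continue
        if inside:
            child = re.match(r"^\s+([\w-]+):\s*(.*)$", line)
            if child is None:
                inside = False  # a dedented, blank, or unrelated line ends the block
                continue
            if child.group(2) == "":
                missing.append(child.group(1))
    return missing
```

Run against the frontmatter in this hunk, the function reports `stack` as the key without a value; once a lifecycle value such as `ga` is added, it reports nothing.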

github-actions · 2026-05-10T17:56:55Z

+
+# Troubleshoot upgrades [troubleshooting-upgrades]
+
+Usually, [{{es}} upgrades](/deploy-manage/upgrade/deployment-or-cluster/elasticsearch.md) proceed smoothly due to [planning](/deploy-manage/upgrade/plan-upgrade.md) and [preparation](/deploy-manage/upgrade/prepare-to-upgrade.md) due dilligence.


Typo: due dilligence → due diligence.

Suggested change

-Usually, [{{es}} upgrades](/deploy-manage/upgrade/deployment-or-cluster/elasticsearch.md) proceed smoothly due to [planning](/deploy-manage/upgrade/plan-upgrade.md) and [preparation](/deploy-manage/upgrade/prepare-to-upgrade.md) due dilligence.
+Usually, [{{es}} upgrades](/deploy-manage/upgrade/deployment-or-cluster/elasticsearch.md) proceed smoothly due to [planning](/deploy-manage/upgrade/plan-upgrade.md) and [preparation](/deploy-manage/upgrade/prepare-to-upgrade.md) due diligence.

github-actions · 2026-05-10T17:56:56Z

 * _(Recommended)_ **A rolling restart**

-    This option allows you to upgrade your cluster one node at a time without interrupting service. Running multiple versions of {{es}} in the same cluster beyond the duration of an upgrade is not supported, as shards cannot be replicated from upgraded nodes to nodes running the old-version. Running more than two versions of {{es}} in the same cluster is not supported.
+    This option allows you to upgrade your cluster one node at a time without interrupting service. Running multiple versions of {{es}} in the same cluster beyond the duration of an upgrade is not supported, as shards cannot be replicated from upgraded nodes to nodes running the earlier ersion. Running more than two versions of {{es}} in the same cluster is not supported.


Typo: earlier ersion should be earlier version.

Suggested change

-This option allows you to upgrade your cluster one node at a time without interrupting service. Running multiple versions of {{es}} in the same cluster beyond the duration of an upgrade is not supported, as shards cannot be replicated from upgraded nodes to nodes running the earlier ersion. Running more than two versions of {{es}} in the same cluster is not supported.
+This option allows you to upgrade your cluster one node at a time without interrupting service. Running multiple versions of {{es}} in the same cluster beyond the duration of an upgrade is not supported, as shards cannot be replicated from upgraded nodes to nodes running the earlier version. Running more than two versions of {{es}} in the same cluster is not supported.

github-actions · 2026-05-10T17:56:56Z

-Master elections only involve master-eligible nodes, so focus your attention on the master-eligible nodes in this situation. These nodes' logs indicate the requirements for a master election, such as the discovery of a certain set of nodes. The [Health]({{es-apis}}operation/operation-health-report) API on these nodes also provides useful information about the situation.
+If there is no elected master node and no node can win an election, all nodes repeatedly log messages about the problem using a [logger](/deploy-manage/monitor/logging-configuration.md) called `org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper`. By default, this happens every 10 seconds. 
+
+During this time the {{es}} will induce `MasterNotDiscoveredException` errors and which its API will report like:


Grammar issue: "the {{es}} will induce ... errors and which its API will report like:" has two problems: the article "the" before {{es}} and the spurious "and which".

Suggested rewrite:

Suggested change

-During this time the {{es}} will induce `MasterNotDiscoveredException` errors and which its API will report like:
+During this time, {{es}} returns `MasterNotDiscoveredException` errors. Its API reports:

github-actions · 2026-05-10T17:56:56Z

+This guide outlines {{es}} logs which indicate either upgrade blocking issues or fatal node start-up errors.
+
+
+## 


This H2 heading is empty (## with no title). It should either be given a title (e.g., ## Monitor upgrade progress) or removed. An untitled heading will also likely fail docs build validation.
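
Empty headings like this one are easy to miss in a long diff. As a rough sketch (the helper name `find_empty_headings` is hypothetical and this is not how the docs build validates pages), a small scan could list them. It only looks for ATX headings made of `#` characters with nothing after them, and it skips lines inside fenced code blocks.

```python
import re

# One to six '#' characters, optionally indented up to three spaces, then nothing.
EMPTY_HEADING = re.compile(r"^\s{0,3}#{1,6}\s*$")


def find_empty_headings(markdown: str) -> list[int]:
    """Return 1-based line numbers of ATX headings that have no title text."""
    hits = []
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        # Toggle on fence markers so '#' comments in code samples are ignored.
        if line.lstrip().startswith(("```", "~~~")):
            in_fence = not in_fence
            continue
        if not in_fence and EMPTY_HEADING.match(line):
            hits.append(number)
    return hits
```

Each returned number points at a heading that needs either a title or removal.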

github-actions · 2026-05-10T17:56:56Z

+{{es}} maintains the data in the data paths of the older nodes and will recover the cluster to health using this data after the nodes are fully upgraded. Therefore, to bring these nodes back into the cluster, upgrade them.
+
+:::{note} :applies_to: { ece:, ess: }
+Usually you can "Reapply" your latest [Deployment activity](/deploy-manage/deploy/elastic-cloud/keep-track-of-deployment-activity.md) {{es}} upgrade to finish upgrading. If the node out of cluster causes [Cluster health](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-health) status of `red`, then plans will be blocked for data safety. If this is the case, kindly [contact us](/troubleshoot/index.md#contact-us) with {{ech}} deployment ID or [{{ece}} diagnostic](/troubleshoot/deployments/cloud-enterprise/run-ece-diagnostics-tool.md) flagged `--deployments` for problematic deployment.


Avoid "kindly" — the Elastic style guide treats it the same as "please", which should be omitted unless asking users to wait or tolerate inconvenience.

Suggested change

-Usually you can "Reapply" your latest [Deployment activity](/deploy-manage/deploy/elastic-cloud/keep-track-of-deployment-activity.md) {{es}} upgrade to finish upgrading. If the node out of cluster causes [Cluster health](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-health) status of `red`, then plans will be blocked for data safety. If this is the case, kindly [contact us](/troubleshoot/index.md#contact-us) with {{ech}} deployment ID or [{{ece}} diagnostic](/troubleshoot/deployments/cloud-enterprise/run-ece-diagnostics-tool.md) flagged `--deployments` for problematic deployment.
+Usually you can "Reapply" your latest [Deployment activity](/deploy-manage/deploy/elastic-cloud/keep-track-of-deployment-activity.md) {{es}} upgrade to finish upgrading. If the node out of cluster causes [Cluster health](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-health) status of `red`, then plans will be blocked for data safety. If this is the case, [contact us](/troubleshoot/index.md#contact-us) with {{ech}} deployment ID or [{{ece}} diagnostic](/troubleshoot/deployments/cloud-enterprise/run-ece-diagnostics-tool.md) flagged `--deployments` for problematic deployment.

github-actions · 2026-05-10T17:56:56Z

+Avoid linking to GitHub issues, pull requests, or internal discussions. Resources should be stable, user-facing documentation.
+-->
+
+- [Related documentation link]


Placeholder links should be filled in with real targets or removed before publishing:

[Related documentation link]

[Contrib/upstream reference]
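
Leftover template placeholders of this shape could be listed with a short script. This is a sketch under the assumption that placeholders appear as list items containing only bracketed text with no link target, as in the hunk above; the name `find_placeholder_links` is hypothetical. It deliberately ignores bracketed heading anchors such as `[troubleshooting-upgrades]`, which are not list items.

```python
import re

# A list item whose entire content is "[some text]" with no "(target)" after it.
BARE_BRACKET_ITEM = re.compile(r"^\s*[-*]\s+\[([^\]]+)\]\s*$")


def find_placeholder_links(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, bracket text) for list items that lack a link target."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = BARE_BRACKET_ITEM.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

Any hit is a candidate for either a real link target or deletion before the page is published.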

github-actions · 2026-05-10T17:56:56Z

+In a testing or development environment with only one or two master-eligible nodes, you cannot avoid stopping half or more of the master-eligible nodes, so the cluster will always become unavailable at some point during the upgrade. When you restart the master-eligible nodes after this unavailability, the cluster will re-form with a single upgraded node, which is therefore fully-upgraded and will reject older nodes' attempts to re-join the cluster. Upgrade the master-eligible nodes last to avoid these rejections.
+
+
+## Symptoms


The required Symptoms and Resolution sections (and the optional Diagnosis, Best practices, Resources sections) contain only template placeholder comments. The resolution section still has the literal stepper code block from the template. These need to be filled in before the page goes live — the page is currently non-functional for users who land on it from the link added in deploy-manage/upgrade/deployment-or-cluster/elasticsearch.md.

Troubleshooting Elasticsearch Upgrades

50322de

stefnestor requested review from a team as code owners May 9, 2026 15:03

stefnestor marked this pull request as draft May 9, 2026 15:03

Merge branch 'main' into stef_esUpgradeIssues

4fd0311

github-actions Bot reviewed May 10, 2026

stefnestor wants to merge 2 commits into elastic:main from stefnestor:stef_esUpgradeIssues
