Merged
38 changes: 30 additions & 8 deletions docs/en/observability/apm/known-issues.asciidoc
@@ -21,6 +21,28 @@ _Versions: XX.XX.XX, YY.YY.YY, ZZ.ZZ.ZZ_
// If applicable, link to fix
////

[discrete]
== APM occasionally returning HTTP 502 "backend connection closed" or "use of closed network connection"

_Elastic Stack versions: >=8.0.0 and <8.18.8 or <8.19.5, >=9.0.0 and <9.0.8 or <9.1.5_ +
_Environments: ECH, ECE_

APM Server on ECH and ECE might occasionally return an HTTP 502 response with the error message "backend connection closed" or "use of closed network connection" for any request because of a rare race condition.
When this happens to an intake request, Elastic APM agents log an error but do not retry, which leads to data loss.

Note that "backend connection closed" and "use of closed network connection" errors can have other causes. The workaround and the released bugfix resolve only the case caused by this race condition.

*Workaround*

To work around this issue:

* Go to *Kibana* > *Fleet* > *Elastic Cloud agent policy*.
* Next to *Elastic APM*, select the *...* icon, then *Edit Integration*.
* Under *General*, select *Advanced options*, then change *Idle time before underlying connection is closed* to *200s*.
* Select *Save Integration*.

This bug will be fixed in 8.18.7, 8.19.4, 9.0.7, 9.1.4 for new deployments, and 8.18.8, 8.19.5, 9.0.8, 9.1.5, 9.2.0 for upgraded deployments.
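
To check whether a given {stack} version still falls in the affected range, the version lists above can be encoded in a small helper. This is only a sketch under stated assumptions: the function names and the `FIXED_FOR_UPGRADED` table are hypothetical, the table is read from the "upgraded deployments" fix versions listed above, and 8.x minors before 8.18 are assumed to never receive the fix.

```python
# Hypothetical helper, not part of any Elastic tooling.
# First fixed patch release per release line for UPGRADED deployments,
# taken from the fix versions listed in this known issue.
FIXED_FOR_UPGRADED = {
    (8, 18): 8,
    (8, 19): 5,
    (9, 0): 8,
    (9, 1): 5,
}


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_affected(version: str) -> bool:
    """Return True if an upgraded deployment on this version may hit the race."""
    major, minor, patch = parse_version(version)
    if (major, minor, patch) < (8, 0, 0):
        return False  # outside the documented range
    fixed_patch = FIXED_FOR_UPGRADED.get((major, minor))
    if fixed_patch is not None:
        return patch < fixed_patch
    # Assumption: 8.x minors before 8.18 stay affected; 9.2.0 and later are fixed.
    return (major, minor) < (8, 18)
```

For example, `is_affected("8.18.7")` is true for an upgraded deployment, while `is_affected("9.2.0")` is false. New deployments are fixed one patch release earlier, which this sketch does not model.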

[discrete]
== APM Integration might be unreachable after upgrading to 8.19.0 and 9.1.0

@@ -99,18 +121,18 @@ PUT _component_template/metrics-apm.internal@custom
== `prefer_ilm` required in component templates to create custom lifecycle policies

_Elastic Stack versions: 8.15.1+_

// The conditions in which this issue occurs
The issue occurs when creating a _new_ cluster using version 8.15.1+.
The issue occurs for any APM data streams created in 8.15.1+.
The issue does _not_ occur if a custom component template was created in version 8.15.0 or earlier.

// Describe why it happens
In 8.15.0, APM Server began using the https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/apm-data[apm-data plugin]
to manage data streams, ingest pipelines, lifecycle policies, and more. In 8.15.1, a fix was introduced to address
unmanaged indices in older clusters using default ILM policies. This fix added a fallback to the default ILM policy
(if it exists) and set the `prefer_ilm` configuration to `false`. This setting impacts clusters where both ILM and
data stream lifecycles (DSL) are in effect—such as when configuring custom ILM policies using `@custom` component
templates, under the conditions mentioned above.
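
As a rough illustration of the settings involved, the following sketch builds a request body for a `@custom` component template that names a custom ILM policy and sets `prefer_ilm` to `true`. The function name and the policy name `my-custom-apm-policy` are hypothetical, and the body is assumed to be sent with a `PUT _component_template/<name>@custom` request; treat it as an outline rather than the exact template from the guide.

```python
import json


def custom_lifecycle_template(policy_name: str) -> dict:
    """Build a component template body that prefers ILM over data stream lifecycle."""
    return {
        "template": {
            "settings": {
                "index": {
                    "lifecycle": {
                        # Hypothetical custom ILM policy name supplied by the caller.
                        "name": policy_name,
                        # Without this, the 8.15.1+ default of false applies.
                        "prefer_ilm": True,
                    }
                }
            }
        }
    }


body = custom_lifecycle_template("my-custom-apm-policy")
print(json.dumps(body, indent=2))
```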

// How to fix it
@@ -122,7 +144,7 @@ to `true` by following the {observability-guide}/apm-ilm-how-to.html[updated gui

_Elastic Stack versions: 8.15.0, 8.15.1, 8.15.2, 8.15.3_ +
_Fixed in Elastic Stack version 8.15.4_

// The conditions in which this issue occurs
The issue only occurs when _upgrading_ the {stack} from 8.12.2 or lower directly to any 8.15.x version prior to 8.15.4.
The issue does _not_ occur when creating a _new_ cluster using any 8.15.x version, or when upgrading
@@ -132,7 +154,7 @@ from 8.12.2 to 8.13.x or 8.14.x and then to 8.15.x.
In APM Server versions prior to 8.13.0, an ingest pipeline performs a version check.
The check fails any APM document produced by an APM Server whose version differs from the version of the installed APM ingest pipeline.
In 8.13.0 the version check in the ingest pipeline was removed.
Due to the combination of an internal change in how APM data management assets are set up from 8.15 onwards and a bug in Elasticsearch,
related to https://github.com/elastic/elasticsearch/issues/112781[lazy rollover of data streams], the ingestion pipeline conducting the version check is not removed on upgrade and prevents the ingestion of data.

// How to fix it