Elasticsearch Exporter

Status

  • Stability: development (metrics, profiles), beta (traces, logs)
  • Distributions: contrib
  • Code Owners: @JaredTan95, @carsonip, @lahsivjar

This exporter supports sending logs, metrics, traces and profiles to Elasticsearch.

The Exporter is API-compatible with Elasticsearch 7.17.x and 8.x. Certain features of the exporter, such as the otel mapping mode, may require newer versions of Elasticsearch. Limited effort will be made to support EOL versions of Elasticsearch -- see https://www.elastic.co/support/eol.

Configuration options

Exactly one of the following settings is required:

  • endpoint (no default): The target Elasticsearch URL to which data will be sent (e.g. https://elasticsearch:9200)
  • endpoints (no default): A list of Elasticsearch URLs to which data will be sent, attempted in round-robin order
  • cloudid (no default): The Elastic Cloud ID of the Elastic Cloud Cluster to which data will be sent (e.g. foo:YmFyLmNsb3VkLmVzLmlvJGFiYzEyMyRkZWY0NTY=)

When the above settings are missing, endpoints will default to the comma-separated ELASTICSEARCH_URL environment variable.
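As an illustration of the endpoint settings above, here is a minimal sketch that uses endpoints with two nodes. The host names are hypothetical placeholders.

```yaml
exporters:
  elasticsearch:
    # Set exactly one of endpoint, endpoints, or cloudid.
    # The URLs below are placeholders, not real hosts.
    endpoints:
      - https://es-node-1.example.com:9200
      - https://es-node-2.example.com:9200
```

Requests are attempted against the listed URLs in round-robin order, as described for the endpoints setting.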

Elasticsearch credentials may be configured via Authentication configuration settings. As a shortcut, the following settings are also supported:

  • user (optional): Username used for HTTP Basic Authentication.
  • password (optional): Password used for HTTP Basic Authentication.
  • api_key (optional): Elasticsearch API Key in "encoded" format (e.g. VFR2WU41VUJIbG9SbGJUdVFrMFk6NVVhVDE3SDlSQS0wM1Rxb24xdXFldw==).

Example:

exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    auth:
      authenticator: basicauth

extensions:
  basicauth:
    client_auth:
      username: elastic
      password: changeme

······

service:
  extensions: [basicauth]
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
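The shortcut credential settings described earlier can be used instead of an authenticator extension. The following is a minimal sketch, assuming the API key is provided through an environment variable; the variable name ELASTICSEARCH_API_KEY is a hypothetical choice.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    # API key in "encoded" format, read from a hypothetical environment variable.
    api_key: ${env:ELASTICSEARCH_API_KEY}
```

Alternatively, user and password can be set in the same place for HTTP Basic Authentication.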

Advanced configuration

HTTP settings

The Elasticsearch exporter supports common HTTP Configuration Settings. Gzip compression is enabled by default, with a default compression level of 1 (gzip.BestSpeed). To disable compression, set compression to none. As a consequence of supporting confighttp, the Elasticsearch exporter also supports common TLS Configuration Settings.

The Elasticsearch exporter sets timeout (HTTP request timeout) to 90s by default. All other defaults are as defined by confighttp.
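A minimal sketch of overriding the HTTP defaults described above. The timeout value and the certificate path are illustrative assumptions; ca_file comes from the collector's common TLS settings rather than from this document.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    # Disable the default gzip compression.
    compression: none
    # Override the default 90s HTTP request timeout (illustrative value).
    timeout: 30s
    tls:
      # Hypothetical path to a CA certificate used to verify the server.
      ca_file: /path/to/ca.pem
```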

Queuing

The Elasticsearch exporter supports the common sending_queue settings. However, the sending queue is currently disabled by default.
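Since the queue is disabled by default, it has to be enabled explicitly. The sketch below assumes the field names of the collector's common sending_queue settings (enabled, num_consumers, queue_size); the values are illustrative, and the names should be checked against the collector version in use.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    sending_queue:
      enabled: true
      # Illustrative values, not recommendations.
      num_consumers: 10
      queue_size: 1000
```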

Batching

Warning

The batcher config is experimental and may change without notice.

The Elasticsearch exporter supports the common batcher settings.

  • batcher:
    • enabled (default=unset): Enable batching of requests into 1 or more bulk requests. On a batcher flush, it is possible for a batched request to be translated to more than 1 bulk request due to flush::bytes.
    • sizer (default=items): Unit of min_size and max_size. Currently only "items" is supported; "bytes" will be supported in the future.
    • min_size (default=5000): Minimum batch size to be exported to Elasticsearch, measured in units according to batcher::sizer.
    • max_size (default=0): Maximum batch size to be exported to Elasticsearch, measured in units according to batcher::sizer. To limit bulk request size, configure flush::bytes instead. ⚠️ It is recommended to keep max_size as 0 as a non-zero value may lead to broken metrics grouping and indexing rejections.
    • min_size_items (DEPRECATED, use batcher::min_size instead): Minimum number of log records / spans / data points in the batched request to immediately trigger a batcher flush.
    • max_size_items (DEPRECATED, use batcher::max_size instead): Maximum number of log records / spans / data points in a batched request.
    • flush_timeout (default=30s): Maximum time the oldest item may spend in the batcher buffer, aka "max age of batcher buffer". When this time is reached, a batcher flush happens regardless of how much content is in the batcher buffer.

By default, the exporter performs its own buffering and batching, as configured through the flush config, and batcher is unused. When batcher::enabled is explicitly set to either true or false, the exporter does not perform any of its own buffering or batching, and the flush::interval config is ignored. In a future release, when the batcher config is stable and has feature parity with the exporter's existing flush config, it will be enabled by default.

Using the common batcher functionality provides several benefits over the default behavior:

  • Combined with a persistent queue, or no queue at all, batcher enables at-least-once delivery. With the default behavior, the exporter will accept data and process it asynchronously, which interacts poorly with queuing.
  • By ensuring the exporter makes requests to Elasticsearch synchronously, client metadata can be passed through to Elasticsearch requests, e.g. by using the headers_setter extension.
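A minimal sketch of opting in to the experimental batcher, using only the settings listed above. The values shown are the documented defaults, apart from enabled.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    batcher:
      # Explicitly setting enabled disables the exporter's own buffering.
      enabled: true
      min_size: 5000
      # Keep max_size at 0 to avoid broken metrics grouping.
      max_size: 0
      flush_timeout: 30s
    flush:
      # Still limits the size of each bulk request.
      bytes: 5000000
```

With this configuration, flush::interval is ignored, while flush::bytes can still split one batched request into several bulk requests.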

Elasticsearch document routing

Telemetry data will be written to signal specific data streams by default: logs to logs-generic-default, metrics to metrics-generic-default, and traces to traces-generic-default. This can be customised through the following settings:

  • logs_index: The index or data stream name to publish events to. The default value is logs-generic-default

  • logs_dynamic_index (optional): uses resource, scope, or log record attributes to dynamically construct index name.

    • enabled (default=false): Enable/disable dynamic index for log records. The index name is resolved in the following order:
      1. If data_stream.dataset or data_stream.namespace exist in attributes (precedence: log record attribute > scope attribute > resource attribute), they will be used to dynamically construct the index name in the form logs-${data_stream.dataset}-${data_stream.namespace}. As a special case with mapping::mode: bodymap, the data_stream.type field (valid values: logs, metrics) is also supported, and the index name takes the form ${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}.
      2. Otherwise, if elasticsearch.index.prefix or elasticsearch.index.suffix exist in attributes (precedence: resource attribute > scope attribute > log record attribute), they will be used to dynamically construct the index name in the form ${elasticsearch.index.prefix}${logs_index}${elasticsearch.index.suffix}.
      3. Otherwise, if the scope name matches the regex /receiver/(\w*receiver), data_stream.dataset will be capture group #1.
      4. Otherwise, the index name falls back to logs-generic-default, and the logs_index config will be ignored.

      Except when the prefix/suffix attributes are used, the resulting docs will contain the corresponding data_stream.* fields; see the restrictions applied to Data Stream Fields.
  • metrics_index (optional): The index or data stream name to publish metrics to. The default value is metrics-generic-default. ⚠️ Note that metrics support is currently in development.

  • metrics_dynamic_index (optional): uses resource, scope or data point attributes to dynamically construct index name. ⚠️ Note that metrics support is currently in development.

    • enabled (default=true): Enable/disable dynamic index for metrics. The index name is resolved in the following order:
      1. If data_stream.dataset or data_stream.namespace exist in attributes (precedence: data point attribute > scope attribute > resource attribute), they will be used to dynamically construct the index name in the form metrics-${data_stream.dataset}-${data_stream.namespace}.
      2. Otherwise, if elasticsearch.index.prefix or elasticsearch.index.suffix exist in attributes (precedence: resource attribute > scope attribute > data point attribute), they will be used to dynamically construct the index name in the form ${elasticsearch.index.prefix}${metrics_index}${elasticsearch.index.suffix}.
      3. Otherwise, if the scope name matches the regex /receiver/(\w*receiver), data_stream.dataset will be capture group #1.
      4. Otherwise, the index name falls back to metrics-generic-default, and the metrics_index config will be ignored.

      Except when the prefix/suffix attributes are used, the resulting docs will contain the corresponding data_stream.* fields; see the restrictions applied to Data Stream Fields.
  • traces_index: The index or data stream name to publish traces to. The default value is traces-generic-default.

  • traces_dynamic_index (optional): uses resource, scope, or span attributes to dynamically construct index name.

    • enabled (default=false): Enable/disable dynamic index for trace spans. The index name is resolved in the following order:
      1. If data_stream.dataset or data_stream.namespace exist in attributes (precedence: span attribute > scope attribute > resource attribute), they will be used to dynamically construct the index name in the form traces-${data_stream.dataset}-${data_stream.namespace}.
      2. Otherwise, if elasticsearch.index.prefix or elasticsearch.index.suffix exist in attributes (precedence: resource attribute > scope attribute > span attribute), they will be used to dynamically construct the index name in the form ${elasticsearch.index.prefix}${traces_index}${elasticsearch.index.suffix}.
      3. Otherwise, if the scope name matches the regex /receiver/(\w*receiver), data_stream.dataset will be capture group #1.
      4. Otherwise, the index name falls back to traces-generic-default, and the traces_index config will be ignored.

      Except when the prefix/suffix attributes are used, the resulting docs will contain the corresponding data_stream.* fields; see the restrictions applied to Data Stream Fields. There is an exception for span events under OTel mapping mode (mapping::mode: otel): span event attributes are considered instead of span attributes, and data_stream.type is always logs instead of traces, so that documents are routed to logs-${data_stream.dataset}-${data_stream.namespace}.
  • logstash_format (optional): Logstash format compatibility. Logs, metrics and traces can be written into an index in Logstash format.

    • enabled (default=false): Enable/disable Logstash format compatibility. When logstash_format::enabled is true, the index name is composed using (logs|metrics|traces)_index or (logs|metrics|traces)_dynamic_index as the prefix and the date as the suffix. For example, if logs_index or logs_dynamic_index resolves to logs-generic-default, the index becomes logs-generic-default-YYYY.MM.DD. The appended date is the date on which the data is generated.
    • prefix_separator (default=-): Separator placed between the index prefix and the date.
    • date_format (default=%Y.%m.%d): Time format (based on strftime) used to generate the date part of the index name.
  • logs_dynamic_id (optional): Dynamically determines the document ID to be used in Elasticsearch based on a log record attribute.

    • enabled (default=false): Enable/disable dynamic ID for log records. If elasticsearch.document_id exists and is not an empty string in the log record attributes, it will be used as the document ID. Otherwise, the document ID will be generated by Elasticsearch. The attribute elasticsearch.document_id is removed from the final document when the otel mapping mode is used. See Setting a document id dynamically.
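The routing settings above can be sketched as follows. The example assumes the transform processor is used to set the data_stream.* attributes; the processor name transform/route-logs, the exporter name elasticsearch/daily, the index name my-logs, and the attribute values are all hypothetical.

```yaml
processors:
  transform/route-logs:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Hypothetical dataset and namespace values.
          - set(attributes["data_stream.dataset"], "myapp")
          - set(attributes["data_stream.namespace"], "production")

exporters:
  # Routes logs by data_stream.* attributes.
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    logs_dynamic_index:
      enabled: true

  # Writes logs to a date-suffixed index such as my-logs-YYYY.MM.DD.
  elasticsearch/daily:
    endpoint: https://elastic.example.com:9200
    logs_index: my-logs
    logstash_format:
      enabled: true
      prefix_separator: "-"
      date_format: "%Y.%m.%d"
```

With the first exporter, log records carrying these attributes would be routed to an index of the form logs-${data_stream.dataset}-${data_stream.namespace}; in otel mapping mode the dataset additionally gets the .otel suffix described in the OTel mapping mode section.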

Elasticsearch document mapping

The Elasticsearch exporter supports several document schemas and preprocessing behaviours, which may be configured through the following settings:

  • mapping:
    • mode (default=otel): The default mapping mode. Valid modes are:
      • none
      • ecs
      • otel
      • raw
      • bodymap
    • allowed_modes (defaults to all mapping modes): A list of allowed mapping modes.

The mapping mode can also be controlled via the client metadata key X-Elastic-Mapping-Mode, e.g. via HTTP headers, gRPC metadata. This will override the configured mapping::mode. It is possible to restrict which mapping modes may be requested by configuring mapping::allowed_modes, which defaults to all mapping modes. Keep in mind that not all processors or exporter configurations will maintain client metadata.
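A minimal sketch of restricting which mapping modes a client may request through X-Elastic-Mapping-Mode. The choice of allowed modes here is only an example.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    mapping:
      # Used when the client does not request a mode.
      mode: otel
      # Requests for any other mode are not permitted.
      allowed_modes: [otel, ecs]
```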

See below for a description of each mapping mode.

OTel mapping mode

The default and recommended "OTel-native" mapping mode.

Requires Elasticsearch 8.12 or above [1], works best with Elasticsearch 8.16 or above [2].

In otel mapping mode, the Elasticsearch Exporter stores documents in Elastic's preferred "OTel-native" schema. In this mapping mode, documents use the original attribute names and closely follow the event structure of the OTLP events.

There is special treatment for the following attributes: data_stream.type, data_stream.dataset, and data_stream.namespace. Instead of serializing these values under the *attributes.* namespace, they are put at the root of the document, to conform with the conventions of the data stream naming scheme that maps these as constant_keyword fields.

data_stream.dataset will always be appended with .otel if dynamic data stream routing mode is active.

Span events are stored in separate documents. They will be routed with data_stream.type set to logs if traces_dynamic_index::enabled is true.

Signal Supported
Logs ✅
Traces ✅
Metrics ✅
Profiles ✅

ECS mapping mode

Warning

The ECS mapping mode is currently undergoing changes, and its behaviour is unstable.

In ecs mapping mode, the Elasticsearch Exporter maps fields from OpenTelemetry Semantic Conventions (version 1.22.0) to Elastic Common Schema where possible. This mode may be used for compatibility with existing dashboards that work with ECS.

Signal ecs
Logs ✅
Traces ✅
Metrics ✅
Profiles 🚫

Bodymap mapping mode

Warning

The Bodymap mapping mode is currently undergoing changes, and its behaviour is unstable.

In bodymap mapping mode, the Elasticsearch Exporter supports only logs and will take the "body" of a log record as the exact content of the Elasticsearch document without any transformation. This mapping mode is intended for use cases where the client wishes to have complete control over the Elasticsearch document structure.

Signal bodymap
Logs ✅
Traces 🚫
Metrics 🚫
Profiles 🚫

None mapping mode

In the none mapping mode, the Elasticsearch Exporter produces documents with the original field names from the OTLP data structures.

Signal none
Logs ✅
Traces ✅
Metrics 🚫
Profiles 🚫

Raw mapping mode

The raw mapping mode is identical to none, except for two differences:

  • In none mode attributes are mapped with an Attributes. prefix, while in raw mode they are not.
  • In none mode span events are mapped with an Events. prefix, while in raw mode they are not.

Signal raw
Logs ✅
Traces ✅
Metrics 🚫
Profiles 🚫

Elasticsearch ingest pipeline

Documents may be optionally passed through an Elasticsearch Ingest pipeline prior to indexing. This can be configured through the following settings:

  • pipeline (optional): ID of an Elasticsearch Ingest pipeline used for processing documents published by the exporter.
  • logs_dynamic_pipeline (optional): Dynamically determines the ingest pipeline to be used in Elasticsearch based on attributes in the log signal.
    • enabled (default=false): Enable/disable dynamic pipeline. If the elasticsearch.ingest_pipeline attribute exists in the log record attributes and is not an empty string, it will be used as the Elasticsearch ingest pipeline. This currently only applies to the log signal. The attribute elasticsearch.ingest_pipeline is removed from the final document when the otel mapping mode is used.
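A minimal sketch combining a static ingest pipeline with the dynamic per-record override. The pipeline ID my-ingest-pipeline is a hypothetical name.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    # Hypothetical ingest pipeline ID.
    pipeline: my-ingest-pipeline
    logs_dynamic_pipeline:
      # Log records carrying a non-empty elasticsearch.ingest_pipeline
      # attribute use that pipeline.
      enabled: true
```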

Elasticsearch bulk indexing

The Elasticsearch exporter uses the Elasticsearch Bulk API for indexing documents. The behaviour of this bulk indexing can be configured with the following settings:

  • num_workers (default=runtime.NumCPU()): Number of workers publishing bulk requests concurrently.
  • flush: Event bulk indexer buffer flush settings
    • bytes (default=5000000): Write buffer flush size limit before compression. A bulk request will be sent immediately when its buffer exceeds this limit. This value should be much lower than Elasticsearch's http.max_content_length config to avoid HTTP 413 Entity Too Large error. It is recommended to keep this value under 5MB.
    • interval (default=30s): Write buffer flush time limit.
  • retry: Elasticsearch bulk request retry settings
    • enabled (default=true): Enable/Disable request retry on error. Failed requests are retried with exponential backoff.
    • max_requests (DEPRECATED, use retry::max_retries instead): Number of HTTP request retries including the initial attempt. If used, retry::max_retries will be set to max_requests - 1.
    • max_retries (default=2): Number of HTTP request retries. To disable retries, set retry::enabled to false instead of setting max_retries to 0.
    • initial_interval (default=100ms): Initial waiting time after an HTTP request fails.
    • max_interval (default=1m): Maximum waiting time after an HTTP request fails.
    • retry_on_status (default=[429]): Status codes that trigger request or document level retries. Request level retry and document level retry status codes are shared and cannot be configured separately. To avoid duplicates, it defaults to [429].

Note

The flush::interval config will be ignored when batcher::enabled config is explicitly set to true or false.
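The bulk indexing settings above fit together as in the following sketch. The num_workers, interval, and max_retries values are illustrative; the rest are the documented defaults.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    # Illustrative value; the default is runtime.NumCPU().
    num_workers: 4
    flush:
      bytes: 5000000
      # Ignored when batcher::enabled is explicitly set.
      interval: 10s
    retry:
      enabled: true
      max_retries: 3
      initial_interval: 100ms
      max_interval: 1m
      retry_on_status: [429]
```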

Elasticsearch node discovery

The Elasticsearch Exporter will regularly check Elasticsearch for available nodes. Newly discovered nodes will automatically be used for load balancing. Settings related to node discovery are:

  • discover:
    • on_start (optional): If enabled the exporter queries Elasticsearch for all known nodes in the cluster on startup.
    • interval (optional): Interval to update the list of Elasticsearch nodes.

Node discovery can be disabled by setting discover::interval to 0.
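A minimal sketch of enabling node discovery on startup with a periodic refresh. The interval value is illustrative.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    discover:
      # Query the cluster for all known nodes at startup.
      on_start: true
      # Refresh the node list periodically (illustrative value).
      interval: 5m
```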

Telemetry settings

The following settings control the Elasticsearch Exporter's own telemetry, for testing and debugging purposes.

⚠️ This is experimental and may change at any time.

  • telemetry:
    • log_request_body (default=false): Logs Elasticsearch client request body as a field in a log line at DEBUG level. It requires service::telemetry::logs::level to be set to debug. WARNING: Enabling this config may expose sensitive data.
    • log_response_body (default=false): Logs Elasticsearch client response body as a field in a log line at DEBUG level. It requires service::telemetry::logs::level to be set to debug. WARNING: Enabling this config may expose sensitive data.
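A minimal sketch of turning on request and response body logging, together with the debug log level that both settings require. This is intended for short-lived debugging only, since the logged bodies may contain sensitive data.

```yaml
exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    telemetry:
      log_request_body: true
      log_response_body: true

service:
  telemetry:
    logs:
      # Required for the bodies to appear in the collector logs.
      level: debug
```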

Exporting metrics

Metrics support is currently in development. The metric types supported are:

  • Gauge
  • Sum
  • Histogram (Delta temporality only)
  • Exponential histogram (Delta temporality only)
  • Summary

Exporting profiles

Profiles support is currently in development, and should not be used in production. Profiles only support the OTel mapping mode.

Example:

exporters:
  elasticsearch:
    endpoint: https://elastic.example.com:9200
    mapping:
      mode: otel

Important

For the Elasticsearch Exporter to be able to export Profiles data, Universal Profiling needs to be installed in the database. See the Universal Profiling getting started documentation. You will need to use the Elasticsearch endpoint, with an Elasticsearch API key.

ECS Mapping

elasticsearchexporter follows ECS mapping defined here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model-appendix.md#elastic-common-schema

When mode is set to ecs, elasticsearchexporter performs conversions for resource-level attributes from their Semantic Conventions (SemConv) names to equivalent Elastic Common Schema (ECS) names.

If the target ECS field name is specified as an empty string (""), the converter will neither convert the SemConv key to the equivalent ECS name nor pass through the SemConv key as-is to become the ECS name.

When "Preserve" is true, the attribute is kept in the payload under its original name and also duplicated under its ECS equivalent.

Semantic Convention Name ECS Name Preserve
cloud.platform cloud.service.name false
container.image.tags container.image.tag false
deployment.environment service.environment false
host.arch host.architecture false
host.name host.hostname true
k8s.cluster.name orchestrator.cluster.name false
k8s.container.name kubernetes.container.name false
k8s.cronjob.name kubernetes.cronjob.name false
k8s.daemonset.name kubernetes.daemonset.name false
k8s.deployment.name kubernetes.deployment.name false
k8s.job.name kubernetes.job.name false
k8s.namespace.name kubernetes.namespace false
k8s.node.name kubernetes.node.name false
k8s.pod.name kubernetes.pod.name false
k8s.pod.uid kubernetes.pod.uid false
k8s.replicaset.name kubernetes.replicaset.name false
k8s.statefulset.name kubernetes.statefulset.name false
os.description host.os.full false
os.name host.os.name false
os.type host.os.platform false
os.version host.os.version false
process.executable.path process.executable false
process.runtime.name service.runtime.name false
process.runtime.version service.runtime.version false
service.instance.id service.node.name false
telemetry.distro.name "" false
telemetry.distro.version "" false
telemetry.sdk.language "" false
telemetry.sdk.name "" false
telemetry.sdk.version "" false

Compound Mapping

Some ECS fields cannot be mapped one-to-one and require more advanced logic.

agent.name

The agent name takes the form of a compound name consisting of 3 components:

  • telemetry.sdk.name or, if not present, defaults to otlp,
  • telemetry.sdk.language, defaulting to unknown in case it is missing,
  • telemetry.distro.name, which is allowed to be empty.

These values are all valid:

telemetry.sdk.name telemetry.sdk.language telemetry.distro.name agent.name
"" "" "" otlp/unknown
"" dotnet "" otlp/dotnet
opentelemetry dotnet "" opentelemetry/dotnet
"" java parts-unlimited-java otlp/java/parts-unlimited-java
"" "" parts-unlimited-java otlp/unknown/parts-unlimited-java

agent.version

Takes the value of telemetry.distro.version or telemetry.sdk.version. If both telemetry.distro.version and telemetry.sdk.version are present, telemetry.distro.version takes precedence.

host.os.type

Maps values of os.type in the following manner:

SemConv Value ECS Value
windows windows
linux linux
darwin macos
aix unix
hpux unix
solaris unix

If os.name is present and has one of the following values, it is mapped as follows:

SemConv Value ECS Value
Android android
iOS ios

Otherwise, it is mapped to an empty string ("").

@timestamp

If the record contains a timestamp, that value is used. Otherwise, the observed timestamp is used.

Setting a document id dynamically

The logs_dynamic_id setting allows users to set the document ID dynamically based on a log record attribute. Besides giving control over the document ID, this setting also works as a deduplication mechanism, as Elasticsearch will refuse to index a document whose ID already exists.

The log record attribute elasticsearch.document_id can be set explicitly by a processor based on the log record.

As an example, the transform processor can create this attribute dynamically:

processors:
  transform/es-doc-id:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Only set the document ID when both source attributes are present.
          - set(attributes["elasticsearch.document_id"], Concat(["log", attributes["event_name"], attributes["event_creation_time"]], "-")) where attributes["event_name"] != nil and attributes["event_creation_time"] != nil

Known issues

version_conflict_engine_exception

Symptom: elasticsearchexporter logs an error "failed to index document" with error.type "version_conflict_engine_exception" and error.reason containing "version conflict, document already exists".

This happens when the target data stream is a TSDB metrics data stream (e.g. using OTel mapping mode sending to an 8.16+ Elasticsearch, or ECS mapping mode sending to system integration data streams).

Elasticsearch Time Series Data Streams require that there be only one document per timestamp with the same dimensions. The purpose is to avoid duplicate data when retrying a batch of metrics that were previously sent but failed to be indexed. The dimensions are mostly made up of resource attributes, scope attributes, scope name, attributes, and the unit.

The exporter can only group metrics with the same dimensions into the same document if they arrive in the same batch. To ensure metrics are not dropped even if they arrive in different batches, the exporter adds a fingerprint of the metric names to the document in the otel mapping mode. For this to take effect, you need at least one of the following Elasticsearch versions: 8.16.5, 8.17.3, 8.19.0, or 9.0.0. If you are on an earlier version, either update your Elasticsearch cluster or install this custom component template:

PUT _component_template/metrics-otel@custom
{
  "template": {
    "mappings": {
      "properties": {
        "_metric_names_hash": {
          "type": "keyword",
          "time_series_dimension": true
        }
      }
    }
  }
}

In most situations, this error is just a sign that Elasticsearch's duplicate detection is working as intended. However, data may be classified as a duplicate when it is not, in which case data is lost.

  1. If the data is not sent in otel mapping mode to metrics-*.otel-* data streams, the metrics name fingerprint is not applied. This can happen for OTel host and k8s metrics that the elasticinframetricsprocessor has translated to the format the host and k8s dashboards in Kibana can consume. If these metrics arrive in the elasticsearchexporter in different batches, they will not be grouped to the same document. This can cause the version_conflict_engine_exception error. Try to remove the batchprocessor from the pipeline (or set send_batch_max_size: 0) to ensure metrics are not split into different batches. This gives the exporter the opportunity to group all related metrics into the same document.

  2. Otherwise, check your metrics pipeline setup for misconfiguration that causes an actual violation of the single writer principle. This means that the same metric with the same dimensions is sent from multiple sources, which is not allowed in the OTel metrics data model.
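The first workaround above can be sketched as follows, assuming an otlp receiver feeds the metrics pipeline as in the earlier examples. It keeps the batch processor but removes the upper limit on batch size, so related metrics are not split across batches.

```yaml
processors:
  batch:
    # 0 means no upper limit on batch size.
    send_batch_max_size: 0

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
```

Removing batch from the processors list entirely is the other option mentioned above.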

flush failed (400) illegal_argument_exception

Symptom: bulk indexer logs an error that indicates "bulk indexer flush error" with bulk request returning HTTP 400 and an error type of illegal_argument_exception, similar to the following.

error   elasticsearchexporter@v0.120.1/bulkindexer.go:343       bulk indexer flush error        {"otelcol.component.id": "elasticsearch", "otelcol.component.kind": "Exporter", "otelcol.signal": "logs", "error": "flush failed (400): {\"error\":{\"type\":\"illegal_argument_exception\",\"caused_by\":{}}}"}

This may happen when you use OTel mapping mode (the default mapping mode from v0.122.0, or explicitly by configuring mapping::mode: otel) sending to Elasticsearch version < 8.12.

To resolve this, it is recommended to upgrade your Elasticsearch to 8.12+, ideally 8.16+. Alternatively, try other mapping modes, but the document structure will be different.

Footnotes

  1. OTel mapping mode uses the undocumented require_data_stream bulk API parameter, which is supported from Elasticsearch 8.12.

  2. Elasticsearch 8.16 contains a built-in otel-data plugin.