Transform Processor

Status
Stability: alpha (traces, metrics, logs)
Distributions: contrib, k8s
Warnings: Unsound Transformations, Identity Conflict, Orphaned Telemetry, Other
Code Owners: @TylerHelmuth, @kentquirk, @bogdandrutu, @evan-bradley, @edmocosta

Note

This documentation applies only to version 0.120.0 and later. Configuration from previous versions is still supported, but no longer documented in this README. For information on earlier versions, please refer to the previous documentation.

The Transform Processor modifies telemetry based on configuration using the OpenTelemetry Transformation Language (OTTL).

For each signal type, the processor takes a list of statements and executes them against the incoming telemetry, following the order specified in the configuration. Each statement can access and transform telemetry using functions, and allows the use of a condition to help decide whether the function should be executed.

Config

General Config

Note

If you don't know how to write OTTL statements yet, first see OTTL's Getting Started docs.

transform:
  error_mode: ignore
  <trace|metric|log>_statements: []

The Transform Processor's primary configuration section is broken down by signal (traces, metrics, and logs) and allows you to configure a list of statements for the processor to execute. The list can be made of:

  • OTTL statements. This option will meet most users' needs. See Basic Config for more details.
  • Objects, which allow users to apply configuration options to a specific list of statements. See Advanced Config for more details.

Within each <signal_statements> list, only certain OTTL Path prefixes can be used:

Signal Path Prefix Values
trace_statements resource, scope, span, and spanevent
metric_statements resource, scope, metric, and datapoint
log_statements resource, scope, and log

This means, for example, that you cannot use the Path span.attributes within the log_statements configuration section.
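
For example, the following sketch (the attribute keys are illustrative assumptions) is valid because log_statements may reference resource, scope, and log Paths; the commented-out statement referencing span.attributes would fail to parse:

transform:
  log_statements:
    # Valid: log_statements can reference resource, scope, and log Paths.
    - set(log.attributes["env"], resource.attributes["deployment.environment"])
    # Invalid: span.attributes is not available within log_statements.
    # - set(log.attributes["span.kind"], span.attributes["kind"])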

error_mode: determines how the processor treats errors that occur while processing a statement. If the top-level error_mode is not specified, propagate will be used. The top-level error_mode can be overridden at statement group level, offering more granular control over error handling. If the statement group error_mode is not specified, the top-level error_mode is applied.

error_mode description
ignore The processor ignores errors returned by statements, logs the error, and continues on to the next statement. This is the recommended mode.
silent The processor ignores errors returned by statements, does not log the error, and continues on to the next statement.
propagate The processor returns the error up the pipeline. This will result in the payload being dropped from the collector.
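
As a sketch of how these settings interact (the statements themselves are illustrative), a top-level error_mode acts as the default while individual statement groups can override it:

transform:
  error_mode: ignore            # default for every statement group
  log_statements:
    - error_mode: propagate     # this group overrides the top-level error_mode
      statements:
        - set(log.body, log.attributes["http.route"])
    - statements:               # this group inherits the top-level "ignore"
        - set(log.attributes["test"], "pass")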

Basic Config

Note

If you don't know how to write OTTL statements yet, first see OTTL's Getting Started docs.

The basic configuration style allows you to configure OTTL statements as a list, without worrying about extra configurations.

This is the simplest way to configure the Transform Processor. If you need global conditions or specific error modes see Advanced Config.

Format:

transform:
  error_mode: ignore
  <trace|metric|log>_statements:
    - string
    - string
    - string

Example:

transform:
  error_mode: ignore
  trace_statements:
    - keep_keys(span.attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
    - replace_pattern(span.attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")
    - limit(span.attributes, 100, [])
    - truncate_all(span.attributes, 4096)
  metric_statements:
    - keep_keys(resource.attributes, ["host.name"])
    - truncate_all(resource.attributes, 4096)
    - set(metric.description, "Sum") where metric.type == "Sum"
    - convert_sum_to_gauge() where metric.name == "system.processes.count"
    - convert_gauge_to_sum("cumulative", false) where metric.name == "prometheus_metric"
  log_statements:
    - set(log.severity_text, "FAIL") where log.body == "request failed"
    - replace_all_matches(log.attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")
    - replace_all_patterns(log.attributes, "value", "/account/\\d{4}", "/account/{accountId}")
    - set(log.body, log.attributes["http.route"])

If you're interested in how OTTL parses these statements, see Context Inference.

Advanced Config

Note

If you don't know how to write OTTL statements yet, first see OTTL's Getting Started docs.

For more complex use cases you may need to use the Transform Processor's advanced configuration style to group related OTTL statements.

Format:

transform:
  error_mode: ignore
  <trace|metric|log>_statements:
    - context: string
      error_mode: propagate
      conditions: 
        - string
        - string
      statements:
        - string
        - string
        - string
    - context: string
      error_mode: silent
      statements:
        - string
        - string
        - string

error_mode: allows overriding the top-level error_mode. See General Config for details on how to configure error_mode.

conditions: a list of OTTL conditions that are evaluated as global conditions for the accompanying set of statements. The conditions are ORed together, so only one condition needs to evaluate to true for the statements (including their individual where clauses) to be executed.

statements: a list of OTTL statements.

Example:

transform:
  error_mode: ignore
  metric_statements:
    - error_mode: propagate
      conditions:
        - metric.type == METRIC_DATA_TYPE_SUM
      statements:
        - set(metric.description, "Sum")

  log_statements:
    - conditions:
        - IsMap(log.body) and log.body["object"] != nil
      statements:
        - set(log.body, log.attributes["http.route"])
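
Because the conditions in a group are ORed together, a group like the following sketch (the severity values are illustrative) executes its statement when either condition matches:

transform:
  error_mode: ignore
  log_statements:
    - conditions:
        - log.severity_text == "ERROR"
        - log.severity_text == "FATAL"
      statements:
        - set(log.attributes["alert"], "true")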

The Transform Processor will enforce that all the Paths, functions, and enums used in a group's statements are parsable. In some situations a combination of Paths, functions, or enums is not allowed. For example:

metric_statements:
  - statements:
    - convert_sum_to_gauge() where metric.name == "system.processes.count"
    - limit(datapoint.attributes, 100, ["host.name"])

In this configuration, the datapoint Path prefix is used in the same group of statements as the convert_sum_to_gauge function. Since convert_sum_to_gauge can only be used with metrics, not datapoints, but the statements list references datapoints via the datapoint Path prefix, the group of statements cannot be parsed.

The solution is to split the statements into separate groups:

metric_statements:
  - statements:
    - limit(datapoint.attributes, 100, ["host.name"])
  - statements:
    - convert_sum_to_gauge() where metric.name == "system.processes.count" 

Alternatively, for simplicity, you can use the basic configuration style:

metric_statements:
  - limit(datapoint.attributes, 100, ["host.name"])
  - convert_sum_to_gauge() where metric.name == "system.processes.count" 

Context inference

Note

This is an advanced topic and is not necessary to get started using the Transform Processor. Read on if you're interested in how the Transform Processor parses your OTTL statements.

An OTTL Context defines which Paths, functions, and enums are available when parsing a statement. The Transform Processor automatically infers the OTTL Context from the Path names, functions, and enums present in each group of statements.

This inference is possible because Path names are prefixed with the Context name. For example:

metric_statements:
  - set(metric.description, "test passed") where datapoint.attributes["test"] == "pass"

In this configuration, the inferred Context value is datapoint, as it is the only Context that supports parsing both datapoint and metric Paths.

In the following example, the inferred Context is metric, as metric is the context capable of parsing both metric and resource data.

metric_statements:
  - set(resource.attributes["test"], "passed")
  - set(metric.description, "test passed")

The primary benefit of context inference is that it enhances the efficiency of statement processing by linking them to the most suitable context. This optimization ensures that data transformations are both accurate and performant, leveraging the hierarchical structure of contexts to avoid unnecessary iterations and improve overall processing efficiency. All of this happens automatically, leaving you to write OTTL statements without worrying about Context.

Grammar

You can learn more in-depth details on the capabilities and limitations of the OpenTelemetry Transformation Language used by the Transform Processor by reading about its grammar.

Supported functions:

These common functions can be used for any Signal.

In addition to the common OTTL functions, the processor defines its own functions to help with transformations specific to this processor:

Metrics only functions

convert_sum_to_gauge

convert_sum_to_gauge()

Converts incoming metrics of type "Sum" to type "Gauge", retaining the metric's datapoints. Noop for metrics that are not of type "Sum".

NOTE: This function may cause a metric to break semantics for Gauge metrics. Use at your own risk.

Examples:

  • convert_sum_to_gauge()

convert_gauge_to_sum

convert_gauge_to_sum(aggregation_temporality, is_monotonic)

Converts incoming metrics of type "Gauge" to type "Sum", retaining the metric's datapoints and setting its aggregation temporality and monotonicity accordingly. Noop for metrics that are not of type "Gauge".

aggregation_temporality is a string ("cumulative" or "delta") that specifies the resultant metric's aggregation temporality. is_monotonic is a boolean that specifies the resultant metric's monotonicity.

NOTE: This function may cause a metric to break semantics for Sum metrics. Use at your own risk.

Examples:

  • convert_gauge_to_sum("cumulative", false)

  • convert_gauge_to_sum("delta", true)

extract_count_metric

Note

This function supports Histograms, ExponentialHistograms and Summaries.

extract_count_metric(is_monotonic)

The extract_count_metric function creates a new Sum metric from a Histogram, ExponentialHistogram or Summary's count value. A metric will only be created if there is at least one data point.

is_monotonic is a boolean representing the monotonicity of the new metric.

The name for the new metric will be <original metric name>_count. The fields that are copied are: timestamp, starttimestamp, attributes, description, and aggregation_temporality. As metrics of type Summary don't have an aggregation_temporality field, this field will be set to AGGREGATION_TEMPORALITY_CUMULATIVE for those metrics.

The new metric that is created will be passed to all subsequent statements in the metrics statements list.

Warning

This function may cause a metric to break semantics for Sum metrics. Use only if you're confident you know what the resulting monotonicity should be.

Examples:

  • extract_count_metric(true)

  • extract_count_metric(false)
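
As a sketch of typical usage (the metric name and unit are illustrative assumptions), the function is usually gated with a where clause; because the new metric is passed to subsequent statements, it can then be matched by its <original metric name>_count name:

metric_statements:
  - extract_count_metric(true) where metric.name == "http.server.duration"
  # The extracted metric is named "http.server.duration_count" and can be modified by later statements.
  - set(metric.unit, "1") where metric.name == "http.server.duration_count"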

extract_sum_metric

Note

This function supports Histograms, ExponentialHistograms and Summaries.

extract_sum_metric(is_monotonic)

The extract_sum_metric function creates a new Sum metric from a Histogram, ExponentialHistogram or Summary's sum value. If the sum value of a Histogram or ExponentialHistogram data point is missing, no data point is added to the output metric. A metric will only be created if there is at least one data point.

is_monotonic is a boolean representing the monotonicity of the new metric.

The name for the new metric will be <original metric name>_sum. The fields that are copied are: timestamp, starttimestamp, attributes, description, and aggregation_temporality. As metrics of type Summary don't have an aggregation_temporality field, this field will be set to AGGREGATION_TEMPORALITY_CUMULATIVE for those metrics.

The new metric that is created will be passed to all subsequent statements in the metrics statements list.

Warning

This function may cause a metric to break semantics for Sum metrics. Use only if you're confident you know what the resulting monotonicity should be.

Examples:

  • extract_sum_metric(true)

  • extract_sum_metric(false)

convert_summary_count_val_to_sum

convert_summary_count_val_to_sum(aggregation_temporality, is_monotonic)

The convert_summary_count_val_to_sum function creates a new Sum metric from a Summary's count value.

aggregation_temporality is a string ("cumulative" or "delta") representing the desired aggregation temporality of the new metric. is_monotonic is a boolean representing the monotonicity of the new metric.

The name for the new metric will be <summary metric name>_count. The fields that are copied are: timestamp, starttimestamp, attributes, and description. The new metric that is created will be passed to all functions in the metrics statements list. Function conditions will apply.

NOTE: This function may cause a metric to break semantics for Sum metrics. Use at your own risk.

Examples:

  • convert_summary_count_val_to_sum("delta", true)

  • convert_summary_count_val_to_sum("cumulative", false)

convert_summary_sum_val_to_sum

convert_summary_sum_val_to_sum(aggregation_temporality, is_monotonic)

The convert_summary_sum_val_to_sum function creates a new Sum metric from a Summary's sum value.

aggregation_temporality is a string ("cumulative" or "delta") representing the desired aggregation temporality of the new metric. is_monotonic is a boolean representing the monotonicity of the new metric.

The name for the new metric will be <summary metric name>_sum. The fields that are copied are: timestamp, starttimestamp, attributes, and description. The new metric that is created will be passed to all functions in the metrics statements list. Function conditions will apply.

NOTE: This function may cause a metric to break semantics for Sum metrics. Use at your own risk.

Examples:

  • convert_summary_sum_val_to_sum("delta", true)

  • convert_summary_sum_val_to_sum("cumulative", false)

copy_metric

copy_metric(Optional[name], Optional[description], Optional[unit])

The copy_metric function copies the current metric, adding it to the end of the metric slice.

name is an optional string. description is an optional string. unit is an optional string.

The new metric will be exactly the same as the current metric. You can use the optional parameters to set the new metric's name, description, and unit.

NOTE: The new metric is appended to the end of the metric slice and therefore will be included in all the metric statements. It is a best practice to ALWAYS include a Where clause when copying a metric that WILL NOT match the new metric.

Examples:

  • copy_metric(name="http.request.status_code", unit="s") where metric.name == "http.status_code"

  • copy_metric(desc="new desc") where metric.description == "old desc"

convert_exponential_histogram_to_histogram

Warning: The approach used in this function to convert exponential histograms to explicit histograms is not part of the OpenTelemetry Specification.

convert_exponential_histogram_to_histogram(distribution, [ExplicitBounds])

The convert_exponential_histogram_to_histogram function converts an ExponentialHistogram to an Explicit (normal) Histogram.

This function requires 2 arguments:

  • distribution - This argument defines the distribution algorithm used to allocate the exponential histogram datapoints into a new Explicit Histogram. There are 4 options:

    • upper - This approach identifies the highest possible value of each exponential bucket (the upper bound) and uses it to distribute the datapoints by comparing the upper bound of each bucket with the ExplicitBounds provided. This approach works better for small/narrow exponential histograms where the difference between the upper and lower bounds is small.

      For example, given:

      • count = 10
      • Boundaries: [5, 10, 15, 20, 25]
      • Upper Bound: 15

      Process:

      1. Start with zeros: [0, 0, 0, 0, 0]
      2. Iterate the boundaries and compare upper = 15 with each boundary:
         - 15 > 5 (skip)
         - 15 > 10 (skip)
         - 15 <= 15 (allocate count to this boundary)
      3. Allocate count: [0, 0, 10, 0, 0]
      4. Final Counts: [0, 0, 10, 0, 0]
    • midpoint - This approach works in a similar way to the upper approach, but instead of using the upper bound, it uses the midpoint of each exponential bucket. The midpoint is identified by calculating the average of the upper and lower bounds. This approach also works better for small/narrow exponential histograms.

      The uniform and random distribution algorithms both utilise the concept of intersecting boundaries. Intersecting boundaries are any boundary in the boundaries array that falls between or on the lower and upper values of the Exponential Histogram boundaries. For example, if you have an Exponential Histogram bucket with a lower bound of 10 and an upper bound of 20, and your boundaries array is [5, 10, 15, 20, 25], the intersecting boundaries are 10, 15, and 20 because they lie within the range [10, 20].

    • uniform - This approach distributes the datapoints for each bucket uniformly across the intersecting ExplicitBounds. The algorithm works as follows:

      • If there are valid intersecting boundaries, the function evenly distributes the count across these boundaries.
      • Calculate the count to be allocated to each boundary.
      • If there is a remainder after dividing the count equally, it distributes the remainder by incrementing the count for some of the boundaries until the remainder is exhausted.

      For example, given:

      • count = 10
      • Exponential Histogram Bounds: [10, 20]
      • Boundaries: [5, 10, 15, 20, 25]
      • Intersecting Boundaries: [10, 15, 20]
      • Number of Intersecting Boundaries: 3
      • Using the formula: count / numOfIntersections = 10 / 3 = 3 remainder 1

      Uniform Allocation:

      1. Start with zeros: [0, 0, 0, 0, 0]
      2. Allocate 3 to each intersecting boundary: [0, 3, 3, 3, 0]
      3. Distribute the remainder of 1: [0, 4, 3, 3, 0]
      4. Final Counts: [0, 4, 3, 3, 0]
    • random - This approach distributes the datapoints for each bucket randomly across the intersecting ExplicitBounds. This approach works in a similar manner to the uniform distribution algorithm with the main difference being that points are distributed randomly instead of uniformly. This works as follows:

      • If there are valid intersecting boundaries, calculate the proportion of the count that should be allocated to each boundary based on the overlap of the boundary with the provided range (lower to upper).
      • For each boundary, a random fraction of the calculated proportion is allocated.
      • Any remaining count (due to rounding or random distribution) is then distributed randomly among the intersecting boundaries.
      • If the bucket range does not intersect with any boundaries, the entire count is assigned to the start boundary.
  • ExplicitBounds represents the list of bucket boundaries for the new histogram. This argument is required and cannot be empty.

WARNINGS:

  • The process of converting an ExponentialHistogram to an Explicit Histogram is not perfect and may result in a loss of precision. It is important to define an appropriate set of bucket boundaries and identify the best distribution approach for your data in order to minimize this loss.

    For example, selecting Boundaries that are too high or too low may result in histogram buckets that are too wide or too narrow, respectively.

  • Negative Bucket Counts are not supported in Explicit Histograms; as such, negative bucket counts are ignored.

  • ZeroCounts are only allocated if the ExplicitBounds array contains a zero boundary. That is, if the Explicit Boundaries that you provide do not start with 0, the function will not allocate any zero counts from the Exponential Histogram.

This function should only be used when Exponential Histograms are not suitable for the downstream consumers or if upstream metric sources are unable to generate Explicit Histograms.

Example:

  • convert_exponential_histogram_to_histogram("random", [0.0, 10.0, 100.0, 1000.0, 10000.0])
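
As a fuller sketch (the metric name and boundaries are illustrative assumptions), the function is typically gated with a where clause so that only the intended metrics are converted:

metric_statements:
  - convert_exponential_histogram_to_histogram("upper", [0.0, 10.0, 100.0, 1000.0]) where metric.name == "http.server.duration"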

scale_metric

scale_metric(factor, Optional[unit])

The scale_metric function multiplies the values of the data points in the metric by the float value factor. If the optional string unit is provided, the metric's unit will be set to this value. The supported metric types are Gauge, Sum, Histogram, and Summary.

Examples:

  • scale_metric(0.1): Scale the metric by a factor of 0.1. The unit of the metric will not be modified.
  • scale_metric(10.0, "kWh"): Scale the metric by a factor of 10.0 and set the unit to kWh.
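
For instance, a sketch that scales millisecond values down to seconds for matching metrics (the unit check is an assumption about the incoming data):

metric_statements:
  - scale_metric(0.001, "s") where metric.unit == "ms"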

aggregate_on_attributes

aggregate_on_attributes(function, Optional[attributes])

The aggregate_on_attributes function aggregates all datapoints in the metric based on the supplied attributes. function is a case-sensitive string that represents the aggregation function and attributes is an optional list of attribute keys of type string to aggregate upon.

The aggregate_on_attributes function removes all datapoint attributes except those specified in the attributes parameter. If the attributes parameter is not set, all attributes are removed from the datapoints. Afterwards, all datapoints are aggregated based on the remaining attributes (none, or the ones present in the list).

NOTE: This function is supported only in metric context.

The following metric types can be aggregated:

  • sum
  • gauge
  • histogram
  • exponential histogram

Supported aggregation functions are:

  • sum
  • max
  • min
  • mean
  • median
  • count

NOTE: Only the sum aggregation function is supported for histogram and exponential histogram datatypes.

Examples:

  • aggregate_on_attributes("sum", ["attr1", "attr2"]) where metric.name == "system.memory.usage"
  • aggregate_on_attributes("max") where metric.name == "system.memory.usage"

The aggregate_on_attributes function can also be used in conjunction with keep_matching_keys or delete_matching_keys.

For example, to remove attribute keys matching a regex and aggregate the metrics on the remaining attributes, you can perform the following statement sequence:

statements:
   - delete_matching_keys(resource.attributes, "(?i).*myRegex.*") where metric.name == "system.memory.usage"
   - aggregate_on_attributes("sum") where metric.name == "system.memory.usage"

To aggregate only using a specified set of attributes, you can use keep_matching_keys.
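
For instance, a sketch mirroring the sequence above (the regular expression is illustrative) that keeps only the matching keys before aggregating:

statements:
   - keep_matching_keys(resource.attributes, "(?i).*myRegex.*") where metric.name == "system.memory.usage"
   - aggregate_on_attributes("sum") where metric.name == "system.memory.usage"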

aggregate_on_attribute_value

aggregate_on_attribute_value(function, attribute, values, newValue)

The aggregate_on_attribute_value function aggregates all datapoints in the metric containing the attribute attribute (type string) with one of the values present in the values parameter (list of strings) into a single datapoint where the attribute has the value newValue (type string). function is a case-sensitive string that represents the aggregation function.

NOTE: This function is supported only in metric context.

The following metric types can be aggregated:

  • sum
  • gauge
  • histogram
  • exponential histogram

Supported aggregation functions are:

  • sum
  • max
  • min
  • mean
  • median
  • count

NOTE: Only the sum aggregation function is supported for histogram and exponential histogram datatypes.

Examples:

  • aggregate_on_attribute_value("sum", "attr1", ["val1", "val2"], "new_val") where metric.name == "system.memory.usage"

The aggregate_on_attribute_value function can also be used in conjunction with keep_matching_keys or delete_matching_keys.

For example, to remove attribute keys matching a regex and aggregate the metrics on the remaining attributes, you can perform the following statement sequence:

statements:
   - delete_matching_keys(resource.attributes, "(?i).*myRegex.*") where metric.name == "system.memory.usage"
   - aggregate_on_attribute_value("sum", "attr1", ["val1", "val2"], "new_val") where metric.name == "system.memory.usage"

To aggregate only using a specified set of attributes, you can use keep_matching_keys.

Examples

Perform transformation if field does not exist

Set attribute test to "pass" if the attribute test does not exist:

transform:
  error_mode: ignore
  trace_statements:
    # accessing a map with a key that does not exist will return nil. 
    - set(span.attributes["test"], "pass") where span.attributes["test"] == nil

Rename attribute

There are 2 ways to rename an attribute key:

You can either set a new attribute and delete the old:

transform:
  error_mode: ignore
  trace_statements:
    - set(resource.attributes["namespace"], resource.attributes["k8s.namespace.name"])
    - delete_key(resource.attributes, "k8s.namespace.name") 

Or you can update the key using regex:

transform:
  error_mode: ignore
  trace_statements:
    - replace_all_patterns(resource.attributes, "key", "k8s\\.namespace\\.name", "namespace")

Move field to attribute

Set attribute body to the value of the log body:

transform:
  error_mode: ignore
  log_statements:
    - set(log.attributes["body"], log.body)

Combine two attributes

Set attribute test to the value of attributes "foo" and "bar" combined.

transform:
  error_mode: ignore
  trace_statements:
    # Use the Concat function to combine any number of strings, separated by a delimiter.
    - set(resource.attributes["test"], Concat([resource.attributes["foo"], resource.attributes["bar"]], " "))

Parsing JSON logs

Given the following JSON body

{
  "name": "log",
  "attr1": "foo",
  "attr2": "bar",
  "nested": {
    "attr3": "example"
  }
}

you can add specific fields as attributes on the log:

transform:
  log_statements:
    - statements:
        # Parse body as JSON and merge the resulting map with the cache map, ignoring non-json bodies.
        # cache is a field exposed by OTTL that is a temporary storage place for complex operations.
        - merge_maps(log.cache, ParseJSON(log.body), "upsert") where IsMatch(log.body, "^\\{") 
          
        # Set attributes using the values merged into cache.
        # If the attribute doesn't exist in cache then nothing happens.
        - set(log.attributes["attr1"], log.cache["attr1"])
        - set(log.attributes["attr2"], log.cache["attr2"])
        
        # To access nested maps you can chain index ([]) operations.
        # If nested or attr3 do not exist in cache then nothing happens.
        - set(log.attributes["nested.attr3"], log.cache["nested"]["attr3"])

Override context statements error mode

transform:
  # default error mode applied to all context statements
  error_mode: propagate
  log_statements:
    # overrides the default error mode for these statements
    - error_mode: ignore
      statements:
        - merge_maps(log.cache, ParseJSON(log.body), "upsert") where IsMatch(log.body, "^\\{")
        - set(log.attributes["attr1"], log.cache["attr1"])

    # uses the default error mode
    - statements:
        - set(log.attributes["namespace"], log.attributes["k8s.namespace.name"])

Get Severity of an Unstructured Log Body

Given the following unstructured log body

[2023-09-22 07:38:22,570] INFO [Something]: some interesting log

You can find the severity using IsMatch:

transform:
  error_mode: ignore
  log_statements:
    - set(log.severity_number, SEVERITY_NUMBER_INFO) where IsString(log.body) and IsMatch(log.body, "\\sINFO\\s")
    - set(log.severity_number, SEVERITY_NUMBER_WARN) where IsString(log.body) and IsMatch(log.body, "\\sWARN\\s")
    - set(log.severity_number, SEVERITY_NUMBER_ERROR) where IsString(log.body) and IsMatch(log.body, "\\sERROR\\s")

Copy attributes matching regular expression to a separate location

If you want to copy resource attributes whose keys match the regular expression pod_labels_.* to a new attribute kubernetes.labels, use the following configuration:

transform:
  error_mode: ignore
  trace_statements:
    - statements:
        - set(resource.cache["attrs"], resource.attributes)
        - keep_matching_keys(resource.cache["attrs"], "pod_labels_.*")
        - set(resource.attributes["kubernetes.labels"], resource.cache["attrs"])

The configuration can also be used with delete_matching_keys() to copy the attributes that do not match the regular expression.

Troubleshooting

When using OTTL you can enable debug logging in the collector to print out useful information, such as the current Statement and the current TransformContext, to help you troubleshoot why a statement is not behaving as you expect. This feature is very verbose, but it gives an accurate view into how OTTL sees the underlying data.

receivers:
  filelog:
    start_at: beginning
    include: [ test.log ]

processors:
  transform:
    error_mode: ignore
    log_statements:
      - set(resource.attributes["test"], "pass")
      - set(scope.attributes["test"], ["pass"])
      - set(log.attributes["test"], true)
          

exporters:
  debug:

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers:
        - filelog
      processors:
        - transform
      exporters:
        - debug
2025-02-13T13:01:07.590-0700    debug   ottl@v0.119.0/parser.go:356     initial TransformContext before executing StatementSequence     {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "TransformContext": {"resource": {"attributes": {}, "dropped_attribute_count": 0}, "cache": {}}}
2025-02-13T13:01:07.591-0700    debug   ottl@v0.119.0/parser.go:35      TransformContext after statement execution      {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "statement": "set(resource.attributes[\"test\"], \"pass\")", "condition matched": true, "TransformContext": {"resource": {"attributes": {"test": "pass"}, "dropped_attribute_count": 0}, "cache": {}}}
2025-02-13T13:01:07.593-0700    debug   ottl@v0.119.0/parser.go:356     initial TransformContext before executing StatementSequence     {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "TransformContext": {"resource": {"attributes": {"test": "pass"}, "dropped_attribute_count": 0}, "scope": {"attributes": {}, "dropped_attribute_count": 0, "name": "", "version": ""}, "cache": {}}}
2025-02-13T13:01:07.594-0700    debug   ottl@v0.119.0/parser.go:35      TransformContext after statement execution      {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "statement": "set(scope.attributes[\"test\"], [\"pass\"])", "condition matched": true, "TransformContext": {"resource": {"attributes": {"test": "pass"}, "dropped_attribute_count": 0}, "scope": {"attributes": {"test": ["pass"]}, "dropped_attribute_count": 0, "name": "", "version": ""}, "cache": {}}}
2025-02-13T13:01:07.594-0700    debug   ottl@v0.119.0/parser.go:356     initial TransformContext before executing StatementSequence     {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "TransformContext": {"resource": {"attributes": {"test": "pass"}, "dropped_attribute_count": 0}, "scope": {"attributes": {"test": ["pass"]}, "dropped_attribute_count": 0, "name": "", "version": ""}, "log_record": {"attributes": {"log.file.name": "test.log"}, "body": "test", "dropped_attribute_count": 0, "flags": 0, "observed_time_unix_nano": 1739476867483160000, "severity_number": 0, "severity_text": "", "span_id": "0000000000000000", "time_unix_nano": 0, "trace_id": "00000000000000000000000000000000"}, "cache": {}}}
2025-02-13T13:01:07.594-0700    debug   ottl@v0.119.0/parser.go:35      TransformContext after statement execution      {"otelcol.component.id": "transform", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "logs", "otelcol.signal": "logs", "statement": "set(log.attributes[\"test\"], true)", "condition matched": true, "TransformContext": {"resource": {"attributes": {"test": "pass"}, "dropped_attribute_count": 0}, "scope": {"attributes": {"test": ["pass"]}, "dropped_attribute_count": 0, "name": "", "version": ""}, "log_record": {"attributes": {"log.file.name": "test.log", "test": true}, "body": "test", "dropped_attribute_count": 0, "flags": 0, "observed_time_unix_nano": 1739476867483160000, "severity_number": 0, "severity_text": "", "span_id": "0000000000000000", "time_unix_nano": 0, "trace_id": "00000000000000000000000000000000"}, "cache": {}}}
2025-02-13T13:01:07.594-0700    info    Logs    {"otelcol.component.id": "debug", "otelcol.component.kind": "Exporter", "otelcol.signal": "logs", "resource logs": 1, "log records": 1}

Contributing

See CONTRIBUTING.md.

Warnings

The Transform Processor uses the OpenTelemetry Transformation Language (OTTL) which allows users to modify all aspects of their telemetry. Some specific risks are listed below, but this is not an exhaustive list. In general, understand your data before using the Transform Processor.

  • Unsound Transformations: Several metrics-only functions allow you to transform one metric data type to another or create new metrics from existing metrics. Transformations between metric data types are not defined in the metrics data model. These functions have the expectation that you understand the incoming data and know that it can be meaningfully converted to a new metric data type or meaningfully used to create new metrics.
    • Although OTTL allows the set function to be used with metric.data_type, its implementation in the Transform Processor is a no-op. To modify a data type you must use a function specific to that purpose.
  • Identity Conflict: Transformation of metrics have the potential to affect the identity of a metric leading to an Identity Crisis. Be especially cautious when transforming metric name and when reducing/changing existing attributes. Adding new attributes is safe.
  • Orphaned Telemetry: The processor allows you to modify span_id, trace_id, and parent_span_id for traces, and span_id and trace_id for logs. Modifying these fields could lead to orphaned spans or logs.

Feature Gate

transform.flatten.logs

The transform.flatten.logs feature gate enables the flatten_data configuration option (default false). With flatten_data: true, the processor provides each log record with a distinct copy of its resource and scope. Then, after applying all transformations, the log records are regrouped by resource and scope.

This option is useful when applying transformations that alter the resource or scope, e.g. set(resource.attributes["to"], log.attributes["from"]), which may otherwise result in unexpected behavior. Using this option typically incurs a performance penalty as the processor must compute many hashes and create copies of resource and scope information for every log record.

The feature is currently only available for log processing.

Example Usage

config.yaml:

transform:
  flatten_data: true
  log_statements:
    - set(resource.attributes["to"], log.attributes["from"])

Run the collector: ./otelcol --config config.yaml --feature-gates=transform.flatten.logs