[processor/transform] Added warnings sections #11052

Merged
merged 10 commits into from
Jun 27, 2022
14 changes: 13 additions & 1 deletion processor/transformprocessor/README.md
@@ -1,10 +1,11 @@
# Transform Processor

| Status | |
| ------------------------ | --------------------- |
|--------------------------|-----------------------|
| Stability | [alpha] |
| Supported pipeline types | traces, metrics, logs |
| Distributions | [contrib] |
| Warnings | [Many](#warnings) |
@mx-psi (Member), Jun 15, 2022

I like the name "caveats" more, although I am not a native English speaker. "Limitations" also sounds like a reasonable name for some of the ones we discussed at the SIG, but maybe it doesn't apply accurately to all of the issues discussed.

Member Author

I was purposely trying to avoid the word "caveats" because I thought it might not be good for non-native English speakers. I can switch to Caveats if everyone agrees it's ok.

Member

I see! Well 'caveats' is understandable to me in that sense, but I don't want to be the representative of non-native speakers 😄

mx-psi marked this conversation as resolved.

The transform processor modifies telemetry based on configuration using the [Telemetry Query Language](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/processing.md#telemetry-query-language).
It takes a list of queries which are performed in the order specified in the config.
@@ -135,5 +136,16 @@ All logs
<!-- markdown-link-check-disable-next-line -->
See [CONTRIBUTING.md](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/CONTRIBUTING.md).


## Warnings

The transform processor's implementation of the [Telemetry Query Language](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/processing.md#telemetry-query-language) (TQL) allows users to modify all aspects of their telemetry. Some specific risks are listed below, but this is not an exhaustive list. In general, understand your data before using the transform processor.

- Transformation of metrics has the potential to affect the [identity of a metric](https://github.com/open-telemetry/opentelemetry-specification/blob/main//specification/metrics/data-model.md#opentelemetry-protocol-data-model-producer-recommendations), leading to an Identity Crisis. Be especially cautious when transforming the metric name and when reducing or changing existing attributes. Adding new attributes is safe.
- Several metric-only functions allow you to transform one metric data type into another or to create new metrics from existing metrics. Transformations between metric data types are not defined in the [metrics data model](https://github.com/open-telemetry/opentelemetry-specification/blob/main//specification/metrics/data-model.md). These functions expect that you understand the incoming data and know that it can be meaningfully converted to a new metric data type or meaningfully used to create new metrics.
- Although the TQL allows the `set` function to be used with `metric.data_type`, its implementation in the transform processor is a no-op. To modify a data type you must use a function specific to that purpose.
- The processor allows you to modify `span_id`, `trace_id`, and `parent_span_id` for traces, and `span_id` and `trace_id` for logs. Modifying these fields could lead to orphaned spans or logs.
- The `limit` function drops attributes at random. If there are attributes that should never be dropped, do not use this function. See [#9734](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9734).
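
The identity risk in the first warning can be illustrated with a small sketch. It is not the processor's implementation: the `identity`, `drop_attribute`, and `group_by_identity` helpers and the sample data points are hypothetical, and identity is simplified here to the metric name plus its attribute set (the data model considers more fields). The sketch only shows how removing an attribute can merge streams that were previously distinct.

```python
from collections import defaultdict

# Hypothetical, simplified data points: (metric name, attributes, value).
points = [
    ("http.requests", {"host": "a", "method": "GET"}, 10),
    ("http.requests", {"host": "b", "method": "GET"}, 7),
]


def identity(name, attributes):
    """Simplified stand-in for metric identity: name plus full attribute set."""
    return (name, frozenset(attributes.items()))


def drop_attribute(attributes, key):
    """Return a copy of the attributes without `key`."""
    return {k: v for k, v in attributes.items() if k != key}


def group_by_identity(pts):
    """Group data point values by their (simplified) stream identity."""
    streams = defaultdict(list)
    for name, attrs, value in pts:
        streams[identity(name, attrs)].append(value)
    return dict(streams)


before = group_by_identity(points)
after = group_by_identity(
    [(name, drop_attribute(attrs, "host"), value) for name, attrs, value in points]
)

print(len(before))  # two distinct streams
print(len(after))   # one stream that now holds both values
```

After `host` is dropped, both points share one identity, so a downstream consumer can no longer tell them apart. Adding an attribute never merges streams in this way, which is consistent with the note that adding new attributes is safe.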

[alpha]: https://github.com/open-telemetry/opentelemetry-collector#alpha
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib