Prometheus Remote Write Parser Should Match InfluxDB Remote Write Parsing #9038
Hey @disfluxly, thanks for opening this feature request and for your insight and interest around Prometheus remote write parsing. We are interested in researching different potential approaches for handling Prometheus metrics that would cover the various plugins we have for Prometheus and Prometheus remote write. If you'd be interested in sharing your insight on what would work best, and/or Prometheus data that you have from production, that would be really useful for us while we undertake a research spike on this topic.
You are correct that the way Telegraf approaches Prometheus metrics differs from the approach outlined in the InfluxDB 1.8 docs. The remote write parser currently works in line with our prometheus parser: the measurement name is the plugin name (e.g. prometheus in the case of the prometheus parser), the field key is the metric name with the sample value, and labels are turned into tags. In the case of the remote write parser, the measurement name is prometheus_remote_write, the field key is the metric name with the sample value as a float64, and all labels are converted into tags. Grouping everything under one measurement named after the plugin is a typical pattern in Telegraf.
That said, the measurement name or field names can be changed to the metric name from Prometheus using the processors.rename plugin. Any more insight/feedback you can give would be really helpful as we look to make improvements in our Prometheus offerings.
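To make that mapping concrete (hypothetical metric name, label, and timestamp, not data from this thread), a scraped sample like node_cpu_seconds_total{host="web01"} 42.5 would come out of the prometheus parser as line protocol roughly like:

prometheus,host=web01 node_cpu_seconds_total=42.5 1618488000000000000

i.e. the measurement is the plugin name, the Prometheus metric name becomes the field key, and the label becomes a tag.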
@helenosheaa - Hello. I can't go into too much detail, and I wouldn't be able to provide any kind of production data. At a high level, I'm using Prometheus within EKS to gather K8s and pod metrics. I'm centralizing all metrics back to a single account, and Telegraf has long been my point of entry into the centralized ingestion pipeline. I currently have a setup for Influx line protocol, where I send other metrics to an http v2 listener (inputs.http_listener_v2) in ILP. Essentially, I'd like to add a second http v2 listener on a separate path that accepts Prometheus remote write. All of this integration already exists, so it's significantly less overhead than having to spin up a second Telegraf service and maintain yet another integration. The problem with using the processor is that it will impact the existing setup (correct?). I'd need to do some tag-based routing so that the processor doesn't run against any metrics that come in via the ILP listener. This isn't a huge issue, it's just not ideal.
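For reference, a minimal sketch of that kind of setup (ports and paths are placeholders; each http_listener_v2 instance binds its own service_address). Whatever processor ends up doing the renaming can then be limited with namepass = ["prometheusremotewrite"] so it never touches metrics arriving via the line protocol listener:

[[inputs.http_listener_v2]]
  # existing line protocol ingestion
  service_address = ":8186"
  paths = ["/write"]
  data_format = "influx"

[[inputs.http_listener_v2]]
  # additional listener for Prometheus remote write
  service_address = ":8187"
  paths = ["/receive"]
  data_format = "prometheusremotewrite"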
To me this seems like an anti-pattern, as it necessitates taking further action to essentially ETL the metrics after input. I'd much rather retain as much data integrity as possible and only act on the data for edge cases.
Also curious if you have an example of how I could use the rename processor to dynamically set the measurement name == field name? The configurations look very static.
@disfluxly I understand, no problem, thanks for explaining. We are trying to gather a list of Prometheus metric names, but we can look to other sources in the community. Yes, you'd need to do some tag-based routing (or similar metric filtering) so the processor only applies to the remote write metrics. In terms of renaming dynamically, you'd be better placed to utilize the Starlark processor; there is also a blog about using Starlark to write custom scripts which may be helpful. For a dynamic mapping between measurement name and field name, a Starlark script along the lines of the following would work:
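A minimal sketch (assuming the parser's default measurement name, prometheusremotewrite, and one sample field per metric):

def apply(metric):
    # only rewrite metrics produced by the remote write parser
    if metric.name == "prometheusremotewrite":
        for k in metric.fields.keys():
            v = metric.fields.pop(k)
            # the Prometheus metric name (the field key) becomes the measurement name
            metric.name = k
            # the sample is kept under a single field named "value"
            metric.fields["value"] = v
    return metric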
This renames the measurement to the field name and changes the field name to "value", as you proposed in your feature request; you can also customize it differently for your use case. Once you've defined your script, you hand its path to the Starlark plugin:
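For example (the path is a placeholder for wherever the script is saved):

[[processors.starlark]]
  script = "/etc/telegraf/rename_prometheusremotewrite.star"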
Closing, as there is now a Starlark script for this use case: #9074
Feature Request
The Prometheus Remote Write Parser added in #8682 should parse metrics in the same way that the InfluxDB Prometheus remote write endpoint does, as defined here:
https://docs.influxdata.com/influxdb/v1.8/supported_protocols/prometheus/#how-prometheus-metrics-are-parsed-in-influxdb
Proposal:
When parsing the metrics, each field name should become the measurement name. The only field in the metric should be "value", and it should always be a float. This matches what InfluxDB does.
Current behavior:
Currently, all metrics are output to a single measurement, prometheusremotewrite, as defined here: https://github.com/influxdata/telegraf/blob/master/plugins/parsers/prometheusremotewrite/parser.go#L58
Desired behavior:
Rather than outputting everything into the same measurement (which will cause a massive set of field/tag keys), metrics should be output into a measurement that matches the field name. We should loop through all metrics, setting the measurement to the name of the field, and then setting the only field to "value", converting the field's value to a float. Text fields would be dropped.
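As a concrete illustration (hypothetical metric, label, and timestamp), a single remote write sample would change roughly as follows:

Current:
prometheusremotewrite,host=web01 node_cpu_seconds_total=42.5 1618488000000000000

Proposed:
node_cpu_seconds_total,host=web01 value=42.5 1618488000000000000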
Use case:
In its current state, this is only usable for extremely small clusters or testing purposes. The set of tag/field keys for the prometheusremotewrite measurement will quickly balloon for anything past testing. This also breaks from the expectation set by Influx's Prometheus remote write integration, which could be confusing for users who would like to use Telegraf as a relay to Influx. The proposed behavior would also more closely preserve data model integrity for outputs that send to Prometheus.