
Prometheus Remote Write Parser Should Match InfluxDB Remote Write Parsing #9038

Status: Closed
disfluxly opened this issue Mar 23, 2021 · 5 comments
Labels: area/prometheus, feature request

Comments

disfluxly commented Mar 23, 2021

Feature Request

The Prometheus Remote Write Parser added in #8682 should parse metrics in the same way that InfluxDB's Prometheus remote write endpoint does, as defined here:

https://docs.influxdata.com/influxdb/v1.8/supported_protocols/prometheus/#how-prometheus-metrics-are-parsed-in-influxdb

Proposal:

When parsing the metrics, each field's name should become the measurement name. The only field in the resulting metric should be "value", and it should always be a float. This matches what InfluxDB does.

Current behavior:

Currently, all metrics are output to a single measurement: prometheusremotewrite, as defined here: https://github.com/influxdata/telegraf/blob/master/plugins/parsers/prometheusremotewrite/parser.go#L58

Desired behavior:

Rather than outputting everything into the same measurement (which will cause a massive set of field/tag keys), metrics should be output into a measurement that matches the field name. We should loop through all metrics, setting the measurement to the name of the field, and then setting the only field to "value", converting the field's value to a float. Text fields would be dropped.
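
For illustration, a sketch with a hypothetical sample, node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6. In line protocol terms, the parser currently emits something like:

prometheusremotewrite,cpu=0,mode=idle node_cpu_seconds_total=12345.6

whereas the InfluxDB-style parsing proposed here would emit:

node_cpu_seconds_total,cpu=0,mode=idle value=12345.6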

Use case:

In its current state, this is only usable for extremely small clusters or testing purposes. The set of tag/field keys for the prometheusremotewrite measurement will quickly balloon for anything beyond that. This also breaks the expectation set by Influx's Prometheus remote write integration, which could be confusing for users who would like to use Telegraf as a relay to Influx.

This would also more closely preserve data-model integrity for outputs that send to Prometheus.

disfluxly added the feature request label Mar 23, 2021
helenosheaa (Member) commented

Hey @disfluxly,

Thanks for opening this feature request and for your insight/interest around Prometheus remote write parsing.

We are interested in researching different potential approaches for handling Prometheus metrics across the various plugins we have for Prometheus and Prometheus remote write. If you'd be willing to share your insight on what would work best, and/or Prometheus data that you have from production, that would be really useful for us while we undertake a research spike on this topic.

You are correct that the way Telegraf approaches Prometheus metrics differs from the approach outlined in the InfluxDB 1.8 docs.

The way the remote write parser currently works is in line with our Prometheus parser's metric_version = 2 (which is what Telegraf currently recommends when using our Prometheus plugins).

The measurement name is the plugin name, e.g. prometheus in the case of the Prometheus parser; the field key is the metric name, the field value is the sample value, and labels are turned into tags.

In the case of the remote write parser, the measurement name is prometheus_remote_write, the field key is the metric name, and the sample value is stored as a float64. All labels are converted into tags.

Grouping metrics under one measurement name, which is the plugin's name, is a typical pattern in Telegraf. That said, the measurement name or field names can be changed to the metric name from Prometheus using the processors.rename plugin.
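
For example, a minimal sketch of processors.rename (the metric name here is hypothetical, and note that each replace rule is static, one per known name):

[[processors.rename]]
  [[processors.rename.replace]]
    field = "node_cpu_seconds_total"
    dest = "value"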

Any more insight/feedback you can give would be really helpful as we look to make improvements in our prometheus offerings.

helenosheaa self-assigned this Mar 25, 2021
disfluxly (Author) commented

@helenosheaa - Hello. I can't go into too much detail, and I wouldn't be able to provide any kind of production data. At a high level, I'm using Prometheus within EKS to gather K8s and pod metrics. I'm centralizing all metrics back to a single account, and Telegraf has long been my point of entry into the centralized ingestion pipeline.

I currently have a setup for Influx line protocol, where I send other metrics to an http v2 listener in ILP. Essentially, I'd like to add a second http v2 listener on a separate path that accepts Prometheus remote write requests. All of this integration already exists, so it's significantly less overhead than having to spin up a second Telegraf service and add yet another integration.
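
As a sketch of what I mean (ports and paths here are hypothetical, and each listener would need its own service_address):

[[inputs.http_listener_v2]]
  service_address = ":8186"
  path = "/write"
  data_format = "influx"

[[inputs.http_listener_v2]]
  service_address = ":8187"
  path = "/receive"
  data_format = "prometheusremotewrite"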

The problem with using the processor is that it'll impact the existing setup (correct?). I'd need to do some tag-based routing so that the processor doesn't run against any metrics that come in via the ILP listener. This isn't a huge issue, it's just not ideal.

Grouping metrics under one measurement name, which is the plugin's name, is a typical pattern in Telegraf

To me this seems like an anti-pattern, as it necessitates taking further action in order to essentially ETL the metrics post-input. I'd much rather retain as much of the data integrity as possible and only take action against the data for edge cases.

disfluxly (Author) commented

Also curious if you have an example of how I could use the rename processor to dynamically set the measurement name == field name? The configurations look very static.

helenosheaa (Member) commented

@disfluxly I understand, no problem, thanks for explaining. We are trying to gather a list of Prometheus metric names, but we can look to other sources in the community.

Yes, you'd need to do either some tag-based routing or use namepass = ["prometheus_remote_write"] so that only metrics from Prometheus remote write are directed to the processor.
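
For instance, a minimal sketch scoping the Starlark processor discussed below (the script path is just an example):

[[processors.starlark]]
  # only remote write metrics reach this processor; ILP metrics pass through untouched
  namepass = ["prometheus_remote_write"]
  script = "/usr/local/bin/myscript.star"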

In terms of renaming dynamically, you'd be better placed to use the Starlark processor. There is also a blog about using Starlark to write custom scripts which may be helpful.

For a dynamic mapping between measurement name and field name, something along the lines of:

def apply(metric):
    # emit one metric per field: measurement name = field name, field key = "value"
    metrics = []
    for k, v in metric.fields.items():
        m = deepcopy(metric)  # deepcopy() is provided by Telegraf's Starlark processor
        m.name = k
        m.fields.clear()
        m.fields["value"] = v
        metrics.append(m)
    return metrics

This sets the measurement name to the field name and renames the field key to value, emitting one metric per field, as you proposed in your feature request. Or you can customize it differently for your use case.

Once you've defined your script, you hand the path to the Starlark plugin:

[[processors.starlark]]
  script = "/usr/local/bin/myscript.star"

helenosheaa (Member) commented

Closing, as there is now a Starlark script for this use case: #9074
