docs(aggregators.final): Swap useful statements #15159

Merged
merged 1 commit into from Apr 18, 2024

Conversation

powersj

@powersj powersj commented Apr 15, 2024

Summary

The statements seem switched around currently and are causing confusion for users. Periodic is used to avoid gaps in the data, while timeout is used to only show the last value when a whole bunch come in, i.e. downsample.

https://community.influxdata.com/t/get-at-least-every-5-min-datapoint-by-mqtt-consumer/33911/8

Checklist

  • No AI generated code was used in this PR

@telegraf-tiger telegraf-tiger bot added docs Issues related to Telegraf documentation and configuration descriptions plugin/aggregator 1. Request for new aggregator plugins 2. Issues/PRs that are related to aggregator plugins labels Apr 15, 2024
@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Apr 15, 2024

@srebhan srebhan left a comment

I don't think the new wording is correct. On timeout, all metrics received within the timeout are swallowed and the aggregator only outputs the last metric if it is older than the timeout! So it "fills a gap" if no metric arrived within timeout.

For downsampling you want to get the data in a fixed interval (usually with a lower frequency than the input). This is exactly what the periodic strategy does: it outputs the last metric received at a fixed interval (the period). Here the "age" of the metric is irrelevant.
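
To make the two strategies described above concrete, here is a minimal Python sketch of a simplified model. It is not Telegraf's code: the function `simulate`, its parameters, and the rule that an emitted series leaves the cache are assumptions made for illustration, with times given as whole seconds.

```python
def simulate(strategy, arrivals, period, series_timeout, end):
    """Return (push_time, metric_time) pairs emitted by a simplified model.

    arrivals: sorted arrival times (in seconds) of a single series.
    The model pushes at every multiple of `period` up to and including `end`.
    """
    emitted = []
    last = None  # time of the last metric cached since the previous emit
    i = 0
    for t in range(period, end + 1, period):
        # Cache every metric that arrived up to this push; only the last one is kept.
        while i < len(arrivals) and arrivals[i] <= t:
            last = arrivals[i]
            i += 1
        if last is None:
            continue
        age = t - last
        # "periodic" ignores the age; "timeout" waits until the metric is stale.
        if strategy == "periodic" or age > series_timeout:
            emitted.append((t, last))
            last = None  # assumption: an emitted series is dropped from the cache
    return emitted


# One metric per second for 30 s, then silence.
arrivals = list(range(1, 31))
periodic = simulate("periodic", arrivals, period=10, series_timeout=5, end=60)
timeout = simulate("timeout", arrivals, period=10, series_timeout=5, end=60)
print(periodic)
print(timeout)
```

In this model the periodic run emits one metric per period while data is flowing, and the timeout run stays silent until the series has been quiet for longer than `series_timeout`.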

@Hipska

Hipska commented Apr 16, 2024

See this post for reference on how we came to this conclusion. For the use case of the TS, the current wording is very confusing.

@powersj

powersj commented Apr 16, 2024

So it "fills a gap" if no metric arrived within timeout.

This sentence is what is causing confusion. If no metrics are received within a timeout, then nothing is returned. That is, if there are no metrics, a gap is produced. The text literally says to use "timeout" to fill gaps, but that is not the behavior.

For example, given the following config, and if I send a single metric:

[[aggregators.final]]
  period = "10s"
  delay = "5s"
  output_strategy = "timeout"

2024-04-16T13:53:00Z D! [aggregators.final] Updated aggregation range [2024-04-16 07:53:00 -0600 MDT, 2024-04-16 07:53:10 -0600 MDT]
metric,topic=telegraf/test value=43 1713275576116715552
2024-04-16T13:53:02Z D! [outputs.file] Wrote batch of 1 metrics in 51.7µs
2024-04-16T13:53:02Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-04-16T13:53:10Z D! [aggregators.final] Updated aggregation range [2024-04-16 07:53:10 -0600 MDT, 2024-04-16 07:53:20 -0600 MDT]
2024-04-16T13:53:12Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-04-16T13:53:20Z D! [aggregators.final] Updated aggregation range [2024-04-16 07:53:20 -0600 MDT, 2024-04-16 07:53:30 -0600 MDT]
2024-04-16T13:53:22Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-04-16T13:53:30Z D! [aggregators.final] Updated aggregation range [2024-04-16 07:53:30 -0600 MDT, 2024-04-16 07:53:40 -0600 MDT]
2024-04-16T13:53:32Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics

@srebhan

srebhan commented Apr 16, 2024

@powersj I agree that this can cause confusion, but the statement of this PR

  ##   timeout  -- output a metric if no new input arrived for `series_timeout`;
  ##               useful to down sample the input data
  ##   periodic -- output the last received metric every `period`;
  ##               useful for filling gaps in input data

is just wrong. You won't be able to downsample using timeout.

@powersj

powersj commented Apr 16, 2024

You won't be able to downsample using timeout.

Your definition above is that downsampling is about the periodic return of data, but I can also read downsampling as taking many measurements and reducing them down to a single one, i.e. reducing the rate at which data is transmitted. If you receive 100 values during one period, and final returns only the last one, is that not downsampling?

@srebhan

srebhan commented Apr 16, 2024

@powersj the definition of downsampling references a "rate" so my take is that it is equidistant. But no matter if this is true or not, the timeout strategy will only output data on timeout. So if you really get data at a regular interval, the aggregator will never output anything! That's not downsampling, is it?

@powersj

powersj commented Apr 16, 2024

the definition of downsampling references a "rate" so my take is that it is equidistant. But no matter if this is true or not,

Unfortunately, I will disagree and state that this definition absolutely does matter.

In terms of Telegraf, users are generally not doing signal processing. It is probably better to reference a definition specific to metrics and data, for example from our own blog post on downsampling.

@powersj

powersj commented Apr 16, 2024

I want to specifically call out the following quote:

Additionally, I’ve seen many users ask: “How can I downsample my data to retrieve the mean value and value count from my data at every aggregation interval?”

Replace "mean" with "final".

Given that question from a user, what would you tell them? Use final with periodic or final with timeout?

I would argue that you could use either, and the key difference is if you want data to fill in gaps via periodic or not.

@srebhan

srebhan commented Apr 16, 2024

@powersj so assume a metric that really is coming at a constant rate of 1 sample per minute and use the timeout strategy. How would that ever downsample as the aggregator will never output any metric!?!?

Given that question from a user, what would you tell them? Use final with periodic or final with timeout?

You can ONLY use periodic to achieve that... The period is then the interval for the aggregation and the sampling period...
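
The constant-rate case can be checked with a small Python sketch. It is a simplified model rather than Telegraf's implementation: the helper `timeout_emits` is hypothetical, and the chosen values (one sample per minute, a 30 s period, a 300 s series timeout) are assumptions. In particular, the result depends on the series timeout being longer than the sample interval.

```python
def timeout_emits(arrivals, period, series_timeout, end):
    """Model the timeout strategy for one series; times are whole seconds."""
    emitted, last, i = [], None, 0
    for t in range(period, end + 1, period):
        # Keep only the most recent metric that arrived before this push.
        while i < len(arrivals) and arrivals[i] <= t:
            last, i = arrivals[i], i + 1
        # Emit only once the cached metric is older than the timeout.
        if last is not None and t - last > series_timeout:
            emitted.append((t, last))
            last = None  # assumption: an emitted series is dropped from the cache
    return emitted


steady = list(range(60, 3601, 60))  # one sample per minute for an hour

# While the stream is running, no cached metric ever gets older than the timeout.
running = timeout_emits(steady, period=30, series_timeout=300, end=3600)

# Once the stream stops, the final sample is emitted at the first push after it goes stale.
after_stop = timeout_emits(steady, period=30, series_timeout=300, end=4200)
print(running, after_stop)
```

Under these assumptions the aggregator is silent for the whole hour of regular input and produces a single metric only after the input stops.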


@DStrand1 DStrand1 left a comment

LGTM!

@DStrand1 DStrand1 removed their assignment Apr 16, 2024

@srebhan srebhan left a comment

Thanks @powersj!

@srebhan srebhan merged commit fa0dbba into influxdata:master Apr 18, 2024
26 checks passed
@github-actions github-actions bot added this to the v1.30.2 milestone Apr 18, 2024
powersj added a commit that referenced this pull request Apr 22, 2024
Labels
docs Issues related to Telegraf documentation and configuration descriptions plugin/aggregator 1. Request for new aggregator plugins 2. Issues/PRs that are related to aggregator plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
4 participants