Downsampling data #36

Open
Feudoor opened this issue May 27, 2019 · 2 comments

Comments

@Feudoor
commented May 27, 2019

Is there a way to downsample data?

For example, I need to store raw metrics for 1 month and metrics aggregated into 30-minute intervals for 1 year.

@valyala valyala added the question label May 27, 2019

@valyala

Contributor

commented May 27, 2019

VictoriaMetrics doesn't provide automatic downsampling at the moment, but it can be approximated with the following approach:

  • Run multiple VictoriaMetrics instances (or clusters) with distinct retentions, since each VictoriaMetrics instance works with a single retention.
  • Periodically scrape the required downsampled data via the /federate API from the instance with raw data and store it in the instance with the longer retention.
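
As a rough illustration of the second step, the sketch below builds the /federate URL that a scraper would poll on the raw-data instance. The host name `vm-raw`, the selector, and the helper names are assumptions made for the example, not part of VictoriaMetrics; only the `/federate` path and the `match[]` query parameter come from the federation API itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def federate_url(base_url: str, selectors: list[str]) -> str:
    """Build a /federate URL selecting series via repeated match[] parameters."""
    query = urlencode([("match[]", s) for s in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


def fetch_federated(base_url: str, selectors: list[str], timeout: float = 10.0) -> str:
    """Fetch the selected series as Prometheus text exposition format."""
    with urlopen(federate_url(base_url, selectors), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical address of the short-retention instance holding raw data.
    print(federate_url("http://vm-raw:8428", ['{__name__=~"node_.*"}']))
```

Polling this URL at a long interval keeps roughly one sample per series per interval in the long-retention instance. Note that this is sampling rather than averaging: values between two scrapes are dropped, not aggregated.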

We are planning to add recording rules to VictoriaMetrics with the ability to export the recorded data into external storage such as another VictoriaMetrics instance with higher retention.

side notes

Downsampling is usually used for two purposes:

  • reducing query time over long time ranges
  • reducing the required storage size

VictoriaMetrics is optimized for both cases.

So VictoriaMetrics works quite well without downsampling. An additional benefit is that you can drill down into old data over small time ranges without precision loss.

@valyala valyala added the enhancement label May 27, 2019

@thulle

commented Jun 9, 2019

I'm commenting partly as a +1 on this being a requested feature, and partly to answer the questions @valyala posted on Reddit about reasons to use VM.

Being able to downsample data like RRD, combined with the current storage efficiency for large numbers of time series, and being able to query this downsampled data transparently is, IMHO, the feature set VM needs to become the obvious choice for devops/dashboards.
By transparently I mean that older data should be returned in lower resolution without the querier having to do anything differently. This would allow debugging issues with high-resolution data as they happen and viewing long-term trends just by changing the time range in a dashboard, instead of having to modify queries and datasources in existing/pre-made dashboards or store huge amounts of data.
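
To make "transparently" concrete, here is a minimal sketch of the routing decision a query layer would have to make on the querier's behalf. It assumes the two-instance setup with the retentions from the opening example (raw data for 1 month, 30-minute aggregates for 1 year); the function name and URLs are hypothetical, and nothing like this exists in VM today.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the example in this issue: raw metrics are kept for 1 month.
RAW_RETENTION = timedelta(days=30)


def pick_instance(query_start: datetime, now: datetime,
                  raw_url: str, downsampled_url: str) -> str:
    """Return the instance that can serve the whole range at the best resolution."""
    if now - query_start <= RAW_RETENTION:
        return raw_url          # entire range is still available as raw data
    return downsampled_url      # range reaches past raw retention


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical instance addresses.
    print(pick_instance(now - timedelta(hours=6), now,
                        "http://vm-raw:8428", "http://vm-long:8428"))
```

A dashboard would then only change its time range; the choice of datasource would happen behind the query endpoint.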

@tenmozes tenmozes referenced this issue Jul 27, 2019
0 of 5 tasks complete