Increased memory usage when querying Prometheus datasources since 8.3.x #43369

Closed
radiohead opened this issue Dec 20, 2021 · 17 comments

Labels: area/datasource/backend, datasource/Prometheus, effort/large, prio/customer-support, prio/high (Must be staffed and worked on either currently, or very soon, ideally in time for the next release.), prio/support-subscription, type/bug

Comments

@radiohead
Contributor

radiohead commented Dec 20, 2021


@bohandley update September 12, 2022
Description: Memory usage increased with Prometheus queries
Acceptance Criteria: Improve Prometheus query memory usage by successfully implementing the streaming parser.
Status:
@toddtreece introduced the streaming parser to Prometheus and began working on bringing it to parity with the old Prometheus client.
#49858
#50206
@aocenas helped our squad with a plan to bring the streaming parser to parity by comparing it with the old client.
#52738
@ismail is currently assigned the tasks to bring it to parity and remove the old client.
@toddtreece and @ryantxu have a plan to test the memory usage of Prometheus queries with real-world testing, as well as testing in staging and ops using conprof/parca (and now Pyroscope?). This work is in progress, and we are working to align everyone so that we can improve memory usage for Prometheus queries.


What happened:

When querying Prometheus datasources, the memory usage of the Grafana server has increased in Grafana 8.3.x compared to 8.2.x.

Depending on the size of the result set, memory usage has increased by 1.5x to 3x when comparing 8.3.3 to 8.2.7.

What you expected to happen:

Memory usage to not increase, or to not increase as sharply.

How to reproduce it (as minimally and precisely as possible):

  1. Launch a Prometheus instance
  2. Launch an 8.2.7 Grafana instance (instance A)
  3. Launch a 8.3.3 Grafana instance (instance B)
  4. Add scrape configs for both Grafana instances to your Prometheus instance
  5. Add Prometheus instance as datasource to both Grafana instances
  6. Query (e.g. in Explore) any metric (e.g. Go GC duration) on instance A a few times
  7. Query (e.g. in Explore) any metric (e.g. Go GC duration) on instance B a few times
  8. Query go_memstats_alloc_bytes for instance A and instance B
  9. You will see a higher value for instance B

Anything else we need to know?:

The issue is caused by the fact that the Prometheus datasource has been refactored from a frontend datasource to a backend datasource, so since 8.3 all queries have to be processed by the Grafana server (a rough sketch of this server-side path follows the list below):

  • Grafana Frontend sends the request from the browser to the Grafana server
  • Grafana server calculates the necessary Prometheus query
  • Grafana server sends calculated query to Prometheus API
  • Grafana server receives and parses the response
  • Grafana server converts the response to DataFrames
  • Grafana server sends the DataFrames back to Grafana Frontend
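For illustration only, here is a minimal sketch of what that server-side path looks like with the plugin SDK (grafana-plugin-sdk-go). The Service/promEnvelope types and the parsing details are placeholders, not the actual datasource code; the point is that the whole JSON body is decoded and converted to frames in Grafana server memory:

package prometheus

import (
	"context"
	"encoding/json"
	"net/http"

	"github.com/grafana/grafana-plugin-sdk-go/backend"
	"github.com/grafana/grafana-plugin-sdk-go/data"
)

// Service is a stand-in for the datasource service type; the real one holds
// HTTP clients, settings, etc.
type Service struct {
	promURL string
}

// promEnvelope is a hypothetical struct mirroring the Prometheus JSON response.
type promEnvelope struct {
	Status string          `json:"status"`
	Data   json.RawMessage `json:"data"`
}

// QueryData runs on the Grafana server: since 8.3 every Prometheus response is
// downloaded, decoded and converted to data frames in server memory before the
// frames are returned to the browser.
func (s *Service) QueryData(ctx context.Context, req *backend.QueryDataRequest) (*backend.QueryDataResponse, error) {
	resp := backend.NewQueryDataResponse()
	for _, q := range req.Queries {
		// 1. the server derives the range query (expr, start, end, step) from q
		httpResp, err := http.Get(s.promURL + "/api/v1/query_range")
		if err != nil {
			resp.Responses[q.RefID] = backend.DataResponse{Error: err}
			continue
		}
		// 2. the whole JSON body is decoded into memory here, which is where the
		//    extra allocations compared to the old frontend datasource come from
		var body promEnvelope
		err = json.NewDecoder(httpResp.Body).Decode(&body)
		httpResp.Body.Close()
		if err != nil {
			resp.Responses[q.RefID] = backend.DataResponse{Error: err}
			continue
		}
		// 3. the decoded result is converted to data frames (field conversion omitted)
		frame := data.NewFrame(q.RefID)
		resp.Responses[q.RefID] = backend.DataResponse{Frames: data.Frames{frame}}
	}
	return resp, nil
}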

Environment:

  • Grafana version: 8.3.x
  • Data source type & version: Prometheus
  • OS Grafana is installed on: Google Container-Optimised OS
  • User OS & Browser: MacOS 12.1 / Safari 15.2
  • Grafana plugins: Default only
  • Others: N/A
@radiohead
Contributor Author

@gabor as discussed, here's the issue. Let me know if you need further information.

@simonc6372
Contributor

In testing this, the memory usage seems to scale linearly with the number of active sessions, so this could cause significant memory usage in some circumstances.

@gabor
Contributor

gabor commented Jan 20, 2022

I did some measurements using a large Prometheus JSON response (4MB).

  1. I ran a Grafana Docker image and monitored its memory usage (go_memstats_alloc_bytes) using the metrics it exports. I kept running a query that returned that large JSON response. While I was querying and re-querying, the memory used went up to roughly 50MB, but after I stopped it went back down below 20MB, so I don't think we have a memory leak here.
  2. I measured how much memory the grafana-prometheus-datasource uses; to be exact, how much memory we use to handle the Prometheus query, parse the returned JSON and create the Grafana dataframes (that will be returned to the browser). I used the same JSON response as in [1]. I created a Go benchmark for this and got the results with go test -benchmem -memprofile .... Handling the request uses somewhere between 30MB and 65MB of memory; of that, the code in the Grafana codebase uses roughly 9MB, and the rest is used by the Go Prometheus library ( https://github.com/prometheus/client_golang ). A sketch of such a benchmark follows this list.
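Roughly, a benchmark along these lines can be written as below. parseMatrixResponse is a hypothetical helper standing in for the actual JSON-to-dataframes conversion code, and testdata/large_response.json stands in for the 4MB fixture:

package prometheus

import (
	"os"
	"testing"
)

// BenchmarkParseMatrixResponse measures allocations for turning a large
// Prometheus JSON response into data frames. Run with:
//   go test -bench=ParseMatrixResponse -benchmem -memprofile mem.out
func BenchmarkParseMatrixResponse(b *testing.B) {
	body, err := os.ReadFile("testdata/large_response.json") // ~4MB fixture
	if err != nil {
		b.Fatal(err)
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// parseMatrixResponse is a hypothetical wrapper around the
		// JSON -> prometheus structs -> data frames conversion.
		if _, err := parseMatrixResponse(body); err != nil {
			b.Fatal(err)
		}
	}
}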

possible improvements to the codebase:

  1. We have an ongoing pull request that could improve performance and lower the memory used by the Grafana code from 9MB to 2MB: Prometheus: Matrix framing performance improvements #43766
  2. We could stop using the Prometheus Go client library altogether and write completely custom code that goes from JSON directly to Grafana dataframes (currently we go from JSON to Prometheus client library structures to Grafana dataframes). This is a large change, obviously.
  3. The same as [2], but we would try to do the JSON->dataframes transformation in a streaming fashion to limit memory use. It's not clear whether this is currently possible.
  4. We could implement a hard limit on the Prometheus JSON response and return an error if it is too large. In other words, when we receive the Prometheus response, if its length is more than, for example, 100KB, we return a "result too large" error to the browser. A rough sketch of this idea follows the list.
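For [4], a rough sketch of how such a hard limit could be enforced when reading the Prometheus response. The 100KB threshold and the error message are just examples, not existing settings:

package prometheus

import (
	"errors"
	"io"
	"net/http"
)

const maxResponseBytes = 100 * 1024 // example threshold, not a real setting

var errResponseTooLarge = errors.New("result too large")

// readLimitedBody reads at most maxResponseBytes from the Prometheus response
// and fails instead of buffering an arbitrarily large body in memory.
func readLimitedBody(resp *http.Response) ([]byte, error) {
	defer resp.Body.Close()
	// read one extra byte so we can tell "exactly at the limit" from "over it"
	body, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBytes+1))
	if err != nil {
		return nil, err
	}
	if len(body) > maxResponseBytes {
		return nil, errResponseTooLarge
	}
	return body, nil
}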

@radiohead
Contributor Author

@gabor I think ultimately we'd want something like [2] or [3], because that's the only way to bound memory usage without completely breaking large result sets the way [4] would.

However, that would require us to refactor a significant portion of the code, because AFAIK our current datasource API is not streaming-friendly.

Another thing we could do short-term is to review our resolution calculation logic (the one that calculates the step parameter for range queries - https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) and lower the resolution (i.e. increase the step) for large range queries:

func calculatePrometheusInterval(model *QueryModel, dsInfo *DatasourceInfo, query backend.DataQuery, intervalCalculator intervalv2.Calculator) (time.Duration, error) {
	queryInterval := model.Interval
	//If we are using variable for interval/step, we will replace it with calculated interval
	if isVariableInterval(queryInterval) {
		queryInterval = ""
	}
	minInterval, err := intervalv2.GetIntervalFrom(dsInfo.TimeInterval, queryInterval, model.IntervalMS, 15*time.Second)
	if err != nil {
		return time.Duration(0), err
	}
	calculatedInterval := intervalCalculator.Calculate(query.TimeRange, minInterval, query.MaxDataPoints)
	safeInterval := intervalCalculator.CalculateSafeInterval(query.TimeRange, int64(safeRes))
	adjustedInterval := safeInterval.Value
	if calculatedInterval.Value > safeInterval.Value {
		adjustedInterval = calculatedInterval.Value
	}
	if model.Interval == varRateInterval || model.Interval == varRateIntervalAlt {
		// Rate interval is final and is not affected by resolution
		return calculateRateInterval(adjustedInterval, dsInfo.TimeInterval, intervalCalculator), nil
	} else {
		intervalFactor := model.IntervalFactor
		if intervalFactor == 0 {
			intervalFactor = 1
		}
		return time.Duration(int64(adjustedInterval) * intervalFactor), nil
	}
}

I have a hunch that we might find some improvements there (i.e. make sure that, no matter the time range, we always return the same number of data points). That way we could at least solve the issue for queries with too high a resolution.

@gabor
Contributor

gabor commented Jan 26, 2022

@radiohead

I agree that [2] and [3] are larger-scale changes.

About modifying the step: currently the step is calculated based on the number of pixels available for the visualization (there is no point in getting more data points than there are pixels on the screen), with some limits applied. We also make sure the step is big enough that at most 11000 data points are returned for one time series. We could easily change that 11000 limit to a lower value, but that is a backward-incompatible change in a sense.
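As a rough worked example of the current behavior: for a 7-day range (604,800 seconds), the 11000-point cap already forces the step up to at least 604800 / 11000 ≈ 55s, regardless of how small the requested interval is; lowering the cap to, say, 1100 would raise that minimum step to roughly 550s, which is why changing it is backward-incompatible for anyone relying on the current resolution.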

Also, sometimes the problem is cardinality. For example, if the Prometheus response returns 300 separate time-series blocks, the response can be quite big, even if the number of data points for one time series is smaller.

@radiohead
Contributor Author

We also make sure the step is big enough that at most 11000 data points are returned for one time series.

Yeah, this sounds like a good first step to me.

We could easily change that 11000 limit to a lower value, but that is a backward-incompatible change in a sense.

How about making said limit configurable and setting it to 11000 by default? That way we could look into fine-tuning it while maintaining backward compatibility.
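For illustration only, one shape this could take; the safeResolution JSON field and defaultSafeRes constant are purely hypothetical names, not existing settings:

package prometheus

import "encoding/json"

const defaultSafeRes = 11000 // today's hard-coded limit, kept as the default

// dsJSONData sketches a hypothetical per-datasource option; nothing like this
// exists yet in the Prometheus datasource settings.
type dsJSONData struct {
	SafeResolution int64 `json:"safeResolution"`
}

// safeResolution returns the configured cap on data points per series,
// falling back to the current behavior when the option is unset.
func safeResolution(raw json.RawMessage) int64 {
	var jd dsJSONData
	if err := json.Unmarshal(raw, &jd); err != nil || jd.SafeResolution <= 0 {
		return defaultSafeRes
	}
	return jd.SafeResolution
}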

Also, sometimes the problem is cardinality. For example, if the Prometheus response returns 300 separate time-series blocks, the response can be quite big, even if the number of data points for one time series is smaller.

Yup, I understand, but I don't see any low-hanging, meaningful improvements we could make there. At the very least, having the ability to bound the dataset temporally is a good start.

Let me know if you'd like me to work on the changes to the datapoints limit.

@gabor
Contributor

gabor commented Jan 27, 2022

@radiohead sorry, I probably phrased that ambiguously regarding the 11000 limit.

The 11000 limit is already in the code and live; this has been the behavior for a long time.

@gabor
Contributor

gabor commented Jan 27, 2022

@radiohead hmm, reading the discussion again, maybe there was no misunderstanding; sorry 😄

Anyway, if you think making that limit configurable is worth the effort, please contact the @grafana/observability-metrics squad; they are currently responsible for the Prometheus data source (I am moving more to Loki these days).

@marefr
Member

marefr commented Feb 1, 2022

Not sure if this is an alternative or useful, but in case you're not aware, you can configure a global response limit to cap the size of responses from outgoing HTTP requests.
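If I remember correctly, this is the response_limit option in the [dataproxy] section of grafana.ini (in bytes, 0 meaning no limit); please double-check the docs for your Grafana version. Something along these lines, with the value here being only an example:

[dataproxy]
# maximum allowed response size in bytes for outgoing data source requests (0 = no limit)
response_limit = 10000000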

@marefr
Member

marefr commented Feb 1, 2022

Based on some discussions with @ryantxu, I created this discussion. Feel free to provide any feedback/thoughts/ideas there. Thanks!

@toddtreece
Member

configure a global response limit to limit the size of responses from outgoing HTTP requests.

@marefr does this apply to requests to external plugins as well?

@marefr
Member

marefr commented Feb 3, 2022

@toddtreece no, we have this issue #39096 where the idea is to enforce a max limit on data frame rows.

@radiohead
Contributor Author

Not sure if this is an alternative/useful, but in case you're not aware you can configure a global response limit to limit the size of responses from outgoing HTTP requests.

This would prevent instances from being OOMKilled, but unfortunately it doesn't solve the underlying problem of large query results not fitting in memory.

@evandandrea

The Metrics squad is not currently working on this, so we're moving it to the backlog.

@evandandrea

@bohandley will reach out to @toddtreece / @ryantxu to gather context / state on this issue.

@bohandley
Contributor

My updated status is now at the top of this issue. Once we safely and responsibly remove the old client, this will help with memory usage. @toddtreece and @ryantxu put in a lot of work on this, as did @aocenas, and with the help of @obetomuniz and @itsmylife we have continued that work. This is a Q3 goal for Observability Metrics. Thank you!

@gtk-grafana added the effort/large and prio/high labels on Sep 26, 2022
@bohandley
Contributor

I am happy to say that, thanks to the hard work of @toddtreece, @itsmylife and many other people implementing the streaming parser, memory usage for the Prometheus datasource plugin has dropped significantly. See the following queries on go_memstats_alloc_bytes for two versions of Grafana: v9.0.0, which uses the buffered client, and v9.3.1, which uses the streaming client. I followed the steps listed above; the top image (v9.0.0) peaks at about 50,000,000 bytes, while the bottom (v9.3.1) peaks at about 17,500,000. I'm closing out this issue. Thanks all!
[Screenshots: go_memstats_alloc_bytes graphs for v9.0.0 (~50MB) and v9.3.1 (~17.5MB)]
