InfluxDB: take advantage of influx chunked responses #863
Comments
Have you considered using InfluxDB continuous queries to pre-aggregate and speed up the queries? I haven't used InfluxDB in a production setup yet, but with Graphite (in production with hundreds of thousands of metrics) I have yet to find queries that take more than a second, even for long time ranges and many metrics. Also, are you using columns to distinguish metrics, or a series per metric (with a single value column)? I think that using a where clause on columns like "host" is very bad for performance with InfluxDB. As to streaming in the results, it would definitely be possible, but more people would have to +1 it for me to spend time on it (might be 1-2 days of work at least).
I get this too... Maybe I should look into continuous queries, but I see this issue even with single value columns.
Sorry I never replied to this earlier; apparently it was lost in a sea of open browser tabs. Anyway, continuous queries would, I guess, work fine if I knew beforehand what I wanted to query. However, most of the time I spend in Grafana is exploring performance problems, so I don't know what is interesting before I start exploring. Having said that, a 90% solution to this is just to use a long enough group-by period, which drastically reduces the amount of data sent back from Influx (cursory investigation suggests Influx's speed is inversely proportional to the raw amount of data it needs to pass back). The other benefit of doing this is that Grafana needs to keep less in memory, so the UI remains snappy.
Support for this would clear up #2266, which the InfluxDB team has identified as the root issue: influxdata/influxdb#3242
I have a related question: could Grafana be configured to retrieve data in chunks, using a new request per chunk?

The problem is that I have a lot of metrics stored in a single blob in the DB (write-optimized, storage-size optimized; reads are infrequent and quite expensive), and a dashboard where Grafana creates one HTTP request per metric for a big period of time (say, 1 day or 1 week). The natural thing would be to aggregate these requests and process all metrics simultaneously, as they ultimately point to the very same blob, but modern browsers allow only about 6 concurrent connections to a server by default. Caching the blob decompression and parsing results would resolve the problem, but for long periods (say, 1 week) it is no longer an option: the data for the first 6 requests has to be processed before follow-up requests for the other metrics are sent, so I would need to cache the whole week of per-second data for all possible metrics.

If Grafana were able to retrieve data in configurable chunks (say, no more than 1 hour of data per request), asking for the first hour for all metrics, then, once that completes, the second hour, and so on, it would resolve the problem. Are there configuration options like this?
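The retrieval order described above can be sketched as follows. This is a hypothetical illustration, not an existing Grafana option: `splitRange` and `fetchChunk` are made-up names, and `fetchChunk` stands in for whatever transport the datasource would use.

```typescript
// Hypothetical sketch: split a [from, to) time range into fixed-size chunks
// so every metric's first hour is requested before any metric's second hour.

interface TimeRange {
  from: number; // unix ms, inclusive
  to: number;   // unix ms, exclusive
}

function splitRange(range: TimeRange, chunkMs: number): TimeRange[] {
  const chunks: TimeRange[] = [];
  for (let start = range.from; start < range.to; start += chunkMs) {
    chunks.push({ from: start, to: Math.min(start + chunkMs, range.to) });
  }
  return chunks;
}

// Fetch all metrics chunk by chunk, in chunk-major order, so the server can
// reuse whatever blob it decompressed for the current chunk across metrics.
async function fetchAllChunked(
  metrics: string[],
  range: TimeRange,
  chunkMs: number,
  fetchChunk: (metric: string, chunk: TimeRange) => Promise<number[]>,
): Promise<Map<string, number[]>> {
  const results = new Map<string, number[]>();
  for (const m of metrics) results.set(m, []);
  for (const chunk of splitRange(range, chunkMs)) {
    for (const metric of metrics) {
      const points = await fetchChunk(metric, chunk);
      results.get(metric)!.push(...points);
    }
  }
  return results;
}
```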
With the new streaming infrastructure, this is now something we can consider (it is still a ways off!) but there is a path for it. The things we need are:
However, I am a bit skeptical that the browser will be able to do anything useful if there is too much data.
Hi everyone! What is the status on supporting chunked responses? I am working on developing a plugin for a streaming data source whose responses are chunked. |
The datasource query function can return an rxjs Observable to stream results.
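As a rough illustration of that idea: in a real plugin, query() would return an rxjs `Observable<DataQueryResponse>` from `@grafana/data`. The sketch below is self-contained, so a tiny hand-rolled Observable and simplified types stand in for the real ones; all names here are stand-ins, not the actual Grafana API.

```typescript
// Minimal stand-in for rxjs's Observable: a producer pushes values to a
// subscriber and then signals completion.
type Subscriber<T> = { next: (value: T) => void; complete: () => void };

class SimpleObservable<T> {
  constructor(private producer: (sub: Subscriber<T>) => void) {}
  subscribe(sub: Subscriber<T>): void {
    this.producer(sub);
  }
}

// Simplified response shape: one array of points per partial result.
interface DataQueryResponse {
  data: number[][];
}

class StreamingDatasource {
  // Emit one partial response per chunk instead of a single final response,
  // so the panel can repaint as data arrives.
  query(chunks: number[][]): SimpleObservable<DataQueryResponse> {
    return new SimpleObservable<DataQueryResponse>((sub) => {
      for (const chunk of chunks) {
        sub.next({ data: [chunk] });
      }
      sub.complete();
    });
  }
}
```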
@sknaumov I think the use-case you described in #863 (comment) is slightly different from what is requested in this issue; could you please open a separate feature request for discussing it? Thanks!
@sanga I'm trying to understand your use-case better:
NOTE: in general, I wonder how much this use-case is still supported in InfluxDB 2. I did some tests with InfluxDB 2.0.7:
NOTE: for flux mode, we consume the flux response CSV row by row (https://github.com/grafana/grafana/blob/main/pkg/tsdb/influxdb/flux/executor.go#L78), so at least in theory there is a way to return partial data to the browser. Still, the question remains how useful this would be.
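The linked executor is Go, but the row-by-row idea translates directly. Here is an illustrative TypeScript sketch (not Grafana's actual code) in which each parsed CSV row is handed to a callback immediately, so partial results could in principle be forwarded before the whole response has arrived:

```typescript
// Consume a flux-style CSV body row by row, invoking a callback per row
// instead of buffering the whole response. The naive comma split is for
// illustration only; real flux CSV needs proper quoting/annotation handling.
function consumeCsvRows(
  csv: string,
  onRow: (row: string[]) => void,
): number {
  let count = 0;
  for (const line of csv.split("\n")) {
    if (line.trim() === "") continue; // skip blank separator lines
    onRow(line.split(","));
    count++;
  }
  return count;
}
```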
@gabor to be honest, this ticket is so old that I can't recall precisely what my problem was anymore. I think it was a combination of both, which is to say: slowness caused by a large amount of data. I agree with your assertion that having a large amount of data will probably make the browser unusably slow in any case. Given that, and the fact that chunked queries no longer exist, I think this ticket can be closed.
@sanga thanks for the info, closing it then. |
Fair warning up front: this is possibly not that trivial, and I'm not even entirely sure of the feasibility of doing this, but anyway...
Influx supports chunked HTTP responses, so it will send data back in, well, chunks, as it calculates them. According to the docs, it sends all the data for the time period it has calculated and then moves on to the next chunk of the requested time range. So I'm wondering: might it be possible to read the data in chunk by chunk and paint the graph chunk by chunk? (A very cursory glance at the Flot docs would appear to suggest this is possible, at least.)
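With `chunked=true`, InfluxDB 1.x streams the response as one JSON document per line. A minimal sketch of parsing such a body incrementally, invoking a callback per chunk (e.g. to paint the graph piece by piece); the payload shape here is illustrative, not a byte-exact Influx response:

```typescript
// Simplified shape of one chunked-response line; the real response nests
// series/columns/values inside each result.
interface Chunk {
  results: unknown[];
}

// Parse a newline-delimited JSON body, handing each chunk to the callback
// as soon as its line is available rather than after the whole body arrives.
function handleChunkedBody(
  body: string,
  onChunk: (chunk: Chunk) => void,
): number {
  let n = 0;
  for (const line of body.split("\n")) {
    if (line.trim() === "") continue;
    onChunk(JSON.parse(line) as Chunk); // each line is a complete JSON object
    n++;
  }
  return n;
}
```

In a real client the same per-line logic would run inside the HTTP response's data/stream handler instead of over a complete string.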
The problem I'm trying to get around is basically this: I have some graphs that plot an awful lot of data, and they take a long time to paint. During that time we currently just get a spinner in the graph. Much nicer would be if the graph gradually "filled in", i.e. painted backwards in time (like Splunk, if you ever happen to have used that tool).
What do you think? Reasonable use of time/complexity to implement?