InfluxDB measurements dropdown in QueryBuilder poor performance #65344

Closed
lelithium opened this issue Mar 27, 2023 · 3 comments
Assignees: itsmylife
Labels: datasource/InfluxDB, needs investigation, prio/medium, triage/needs-confirmation

@lelithium

What happened:
When using the InfluxDB Query builder, both in the Explore view and when building a dashboard panel, the measurement selector (FROM <retention policy>.<measurement>...) takes a long time to load, even with a moderate number of measurements.

More context:

  • We use Influx Cloud V2 (in TSM mode), with a DBRP mapping to enable InfluxQL access through Grafana.
  • This seems to scale with the number of measurements: with 50+ measurements the wait is about 40 seconds, while another test run with 14 measurements took about 15 seconds.
  • We set up our InfluxDB datasource as follows:
    • Data Source: InfluxDB
    • Query Language: InfluxQL
    • URL: https://{public InfluxCloud V2 endpoint}:443
    • Timeout: 600 # Necessary or the measurement dropdown times out
    • Custom HTTP Headers:
      • Header: Authorization / Value: "Token [...]"
    • Database:
    • HTTP Method: POST
  • This only happens for the measurement dropdown. The retention policy, field and tag (group by) dropdown selectors all work fine once a measurement has been selected.
  • This also doesn't happen when writing raw queries (e.g. show measurements in the raw query editor returns quickly), which is why I believe the issue is on Grafana's side rather than InfluxCloud's.
  • This doesn't happen on InfluxCloud V1
  • This also doesn't happen on a local test setup, with a non-Cloud InfluxDB V2.6, which makes it hard to replicate.
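
One way to put a number on the raw-query side of this comparison is to time the InfluxQL query directly against the v1-compatible /query endpoint, outside Grafana. This is a minimal sketch, assuming the same DBRP database and Token header as the datasource above; time_influxql is a hypothetical helper name, and the endpoint and token are placeholders:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the total time (in seconds) curl needs to run one
# InfluxQL query against the v1-compatible /query endpoint.
# Usage: time_influxql <base-url> <token> <database> <query>
time_influxql() {
  local base_url=$1 token=$2 db=$3 query=$4
  # -G sends the url-encoded pairs as a GET query string;
  # -o /dev/null discards the response body; -w prints only the timing.
  curl -s -o /dev/null -w '%{time_total}\n' -G "${base_url}/query" \
    --header "Authorization: Token ${token}" \
    --data-urlencode "db=${db}" \
    --data-urlencode "q=${query}"
}

# Example (placeholders, not run here):
#   time_influxql 'https://<your endpoint>' '<your token>' testdbrp 'SHOW MEASUREMENTS'
```

Comparing this figure with the time the dropdown takes to load gives a rough idea of whether the delay comes from the query itself or from somewhere else in the request path.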

What you expected to happen:
The measurement dropdown menu should be displayed in about the same time it takes to run show measurements on the database.

How to reproduce it (as minimally and precisely as possible):
I was able to replicate the issue on a dockerized Grafana running locally, outside of our production setup, but only against an InfluxCloud V2 bucket + DBRP setup as described above. Here are minimal replication instructions nonetheless:

  1. Log in to https://cloud2.influxdata.com/
  2. Create a new bucket (named test henceforth)
  3. Create a new API token with full access (or RW to the test bucket + All Access DBRP. This is only to replicate in testing, so I haven't scoped out ideal permissions)
  4. Use the Influx V2 CLI to create a DBRP mapping to this bucket (influx v1 dbrp create --db testdbrp --rp 30d --bucket-id <test bucket ID> --host https://<your public endpoint> --token '<your token>')
  5. Launch a fresh Grafana v9.4.7 container (grafana/grafana:9.4.7), and add a new InfluxDB datasource as detailed above
  6. Explore the datasource, confirm the "measurement" dropdown displays almost immediately
  7. Add measurements. Here's a quick bash script to create 70 measurements value1, ..., value70:
     for i in $(seq 1 70);do
       curl -i -XPOST 'https://<your endpoint>/write?db=testdbrp' --header 'Authorization: Token <your token>' --data-binary "value$i,host=test,region=test value=$i.$i"
     done
  8. Explore the datasource again, notice that the measurement dropdown now takes about 40 seconds to complete.
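
The write loop in step 7 issues one HTTP request per measurement. As a sketch of an alternative, the same points can be generated as newline-separated line protocol and sent in a single request; gen_points is a hypothetical helper name, and the endpoint and token remain placeholders:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print N line-protocol points, one measurement each
# (value1 ... valueN), matching the points written by the loop in step 7.
gen_points() {
  local n=$1 i
  for i in $(seq 1 "$n"); do
    printf 'value%s,host=test,region=test value=%s.%s\n' "$i" "$i" "$i"
  done
}

# Example (placeholders, not run here): send all 70 points in one request.
#   gen_points 70 | curl -i -XPOST 'https://<your endpoint>/write?db=testdbrp' \
#     --header 'Authorization: Token <your token>' --data-binary @-
```

This only changes how the test data gets in; the resulting measurements should be the same as with the loop.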

Anything else we need to know?:
I went through the logs in DEBUG mode and couldn't find any difference between a quick dropdown display on a local InfluxDB V2 database and its much slower counterpart on the InfluxCloud database.

Environment:

  • Grafana version: Tested on 9.3.0 and 9.4.7
  • Data source type & version: InfluxDB V2
  • OS Grafana is installed on: grafana/grafana:9.3.0 and grafana/grafana:9.4.7
  • User OS & Browser: Happens at least on Windows+Edge and Linux+Chrome
  • Grafana plugins: None
  • Others:

Thanks for reading through, and please reach out if there's any additional information I can provide to help diagnose the issue!

@zuchka added the datasource/InfluxDB and triage/needs-confirmation labels on Mar 28, 2023
@itsmylife self-assigned this on Jul 5, 2023
@itsmylife added the prio/medium and needs investigation labels on Jul 5, 2023
@itsmylife
Contributor

@lelithium Thanks for the detailed issue report. It is clear and well written, much appreciated.
I tried it myself and could not reproduce the poor performance. My instance is on eu-central-1-1.aws.
It might be related to the region, but that would be odd, since you say the raw query editor works fine.
So I have a few questions:

  • Do you have the feature flag influxdbBackendMigration enabled?
  • Is there a possibility that I can try it with your instance?
  • In the network tab you can see what query we send to InfluxDB when you click the measurements dropdown. Could you please run that exact query in the raw query editor and share how fast it is?

@lelithium
Author

Hi @itsmylife

I just tried to replicate this, and it seems to be working fine now, even on instances where I know the issue occurred before. Maybe this was on InfluxData's side?

In any case, thanks for following up on this! I'll reopen with more details if this occurs again.

@itsmylife
Contributor

@lelithium Thanks for the feedback. I am happy that the issue was resolved even though I did nothing. So I believe (now strongly) the problem was on InfluxData's side.
