Queries failing after 60s despite changing dataproxy.timeout to > 60s(changed from 30/default but 180 is not reflecting) for Influx datasource #29457

Closed
som-kanade opened this issue Nov 29, 2020 · 13 comments · Fixed by #34597 or #81252
Assignees
Labels
datasource/InfluxDB needs investigation for unconfirmed bugs. use type/bug for confirmed bugs, even if they "need" more investigating prio/low It's a good idea, but not scheduled for any release

Comments

@som-kanade

What happened:
Long-running queries from the Influx datasource fail after 60s when fetching larger data sets, even though dataproxy.timeout has been changed to 180s (the new value is not taking effect).

What you expected to happen:
We expect that queries should not appear to time out in less time than has been configured in the dataproxy.timeout setting.

Environment:

  • Grafana version: 7.X.X
  • Data source type & version: InfluxDB
  • OS Grafana is installed on: Ubuntu
  • User OS & Browser: Chrome
  • Grafana plugins:
  • Others:
@som-kanade som-kanade changed the title from "Queries failing after 30s despite changing dataproxy.timeout to > 30s for Influx datasource" to "Queries failing after 30s despite changing dataproxy.timeout to > 60s(changed from 30/default but 180 is not reflecting) for Influx datasource" on Nov 29, 2020
@aknuds1 aknuds1 added datasource/InfluxDB needs investigation for unconfirmed bugs. use type/bug for confirmed bugs, even if they "need" more investigating labels Nov 29, 2020
@som-kanade som-kanade changed the title from "Queries failing after 30s despite changing dataproxy.timeout to > 60s(changed from 30/default but 180 is not reflecting) for Influx datasource" to "Queries failing after 60s despite changing dataproxy.timeout to > 60s(changed from 30/default but 180 is not reflecting) for Influx datasource" on Nov 29, 2020
@som-kanade
Author

Any updates here?

@mdicss

mdicss commented Mar 3, 2021

I have the same problem.

@gabor gabor assigned gabor and unassigned ryantxu May 14, 2021
@gabor
Contributor

gabor commented May 17, 2021

there is an issue with dataproxy.timeout (#34177), it might cause this problem, so we'll have to wait until that one is fixed.

@gabor gabor removed the needs investigation for unconfirmed bugs. use type/bug for confirmed bugs, even if they "need" more investigating label May 17, 2021
@gabor
Contributor

gabor commented May 25, 2021

@som-kanade @mdicss i tried this in both grafana 7.5.7 and in current main-branch, and the connection is not cut after 60 seconds when the timeout is set to a large enough value.

is it possible that some network equipment between the influxdb database and grafana-server is dropping the connection after 60 seconds?

if not, could you please provide more detailed info:

  • version of grafana
  • version of influxdb
  • is grafana and influxdb running on the same computer? natively or in docker?
  • are you using the InfluxQL query language or the Flux query language?
  • how exactly is the fetch dropped at 60 seconds? what error message do you see? is it always exactly 60 seconds?

thanks.
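
One way to approach the last question in the list above is to check whether the failure is a client-side timeout or a connection that was closed somewhere along the path. The Go sketch below only illustrates that distinction with the standard library; it is not Grafana or InfluxDB code. The `/slow` and `/drop` endpoints, the `classify`, `probe`, and `runProbes` helpers, and all durations are hypothetical stand-ins chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// classify separates a timeout raised on the client side from any other
// transport failure, such as a peer or middlebox closing the connection.
func classify(err error) string {
	if err == nil {
		return "ok"
	}
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return "client-side timeout"
	}
	return "connection error"
}

// newTestServer starts a local server with three hypothetical endpoints:
// /slow answers late, /drop closes the connection without answering
// (imitating equipment that cuts the connection), /ok answers at once.
func newTestServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(2 * time.Second):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done(): // the client gave up first
		}
	})
	mux.HandleFunc("/drop", func(w http.ResponseWriter, r *http.Request) {
		hj, ok := w.(http.Hijacker)
		if !ok {
			return
		}
		if conn, _, err := hj.Hijack(); err == nil {
			conn.Close() // close the TCP connection without sending a response
		}
	})
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return httptest.NewServer(mux)
}

// probe issues one GET with the given client timeout and classifies the result.
func probe(url string, timeout time.Duration) string {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err == nil {
		resp.Body.Close()
	}
	return classify(err)
}

// runProbes exercises all three endpoints against a fresh local server.
func runProbes() (slow, drop, ok string) {
	srv := newTestServer()
	defer srv.Close()
	slow = probe(srv.URL+"/slow", 200*time.Millisecond)
	drop = probe(srv.URL+"/drop", 5*time.Second)
	ok = probe(srv.URL+"/ok", 5*time.Second)
	return slow, drop, ok
}

func main() {
	slow, drop, ok := runProbes()
	fmt.Println("slow:", slow)
	fmt.Println("drop:", drop)
	fmt.Println("ok:  ", ok)
}
```

In this sketch a timeout error points at a limit enforced by the requesting side, while a non-timeout error is more consistent with something closing the connection, although a real deployment may need a packet capture to be sure.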

@gabor gabor added needs more info Issue needs more information, like query results, dashboard or panel json, grafana version etc and removed type/bug labels May 25, 2021
@som-kanade
Author

Hey @gabor
Grafana version > 7.0.3
Influx version > 1.7.9
is grafana and influxdb running on the same computer? natively or in docker? > No they are running on different EC2 boxes
are you using the InfluxQL query language or the Flux query language? > influx query
how exactly is the fetch dropped at 60 seconds? what error message do you see? is it always exactly 60 seconds? > yes, most of the time it times out after 60s

thanks

@marefr marefr added type/bug area/datasource/proxy and removed needs more info Issue needs more information, like query results, dashboard or panel json, grafana version etc labels May 25, 2021
@marefr marefr added this to the 8.0.0-beta3 milestone May 25, 2021
@marefr
Member

marefr commented May 25, 2021

Fixed by #34597

@marefr marefr closed this as completed May 25, 2021
@gabor
Contributor

gabor commented May 26, 2021

hi @som-kanade thanks for the info. please try this out with grafana8 beta 3 or later (when it is released, it is not released yet). there were changes to the timeout handling that could fix the problem. if it does not we can reopen the issue.

also, moving to a newer influxdb version is worth a try, the "1.x" series is at 1.8.6 currently.

another approach is, as a test case, to move both grafana and influxdb to the same EC2 box and try again; something between the boxes might be cutting the connection after 60 seconds.

@rodolk

rodolk commented Dec 6, 2022

Hello @gabor and all. This issue is still there.
I have grafana 8.4.6 and the datasource is of type influxdb with the Flux query language (I don't know whether Flux makes a difference).
The problem is that Grafana is closing the connection at 60 seconds. I verified this with a packet capture.
I set this configuration for dataproxy:
timeout: 600
keep_alive_seconds: 15
idle_conn_timeout_seconds: 300

Also read_timeout is set:
read_timeout: 0

Now if I set the datasource (HTTP) timeout configuration to any number greater than 60 sec, it will still use 60 sec max.
If I set it to 25 sec, it will time out at 25 sec.
Probably there is something in the datasource connector code that is not allowing any value greater than 60 sec.
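
One pattern that would be consistent with this observation, though nothing in this thread confirms it as the cause, is two independent time limits applied to the same request, where the smaller one always fires first. The Go sketch below illustrates only that general behaviour with the standard library. The `timedGet` and `runDemo` helpers are hypothetical, and the durations are scaled down so that 10 ms stands for one second (250 ms for 25 s, 600 ms for 60 s, 1800 ms for 180 s).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// timedGet sends one GET subject to two independent limits: a per-request
// context deadline (standing in for a configurable timeout) and the
// http.Client Timeout field (standing in for a fixed inner cap).
// It returns the elapsed time and whether the call ended in a timeout.
func timedGet(url string, configured, innerCap time.Duration) (time.Duration, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), configured)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, false
	}

	client := &http.Client{Timeout: innerCap}
	start := time.Now()
	resp, err := client.Do(req)
	elapsed := time.Since(start)
	if err != nil {
		var netErr net.Error
		return elapsed, errors.As(err, &netErr) && netErr.Timeout()
	}
	resp.Body.Close()
	return elapsed, false
}

// runDemo queries a deliberately slow local server twice with the same
// inner cap: once with a smaller configured timeout, once with a larger one.
func runDemo() (lowTimedOut, highTimedOut bool, highElapsedMs int64) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(3 * time.Second): // slower than every limit below
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done(): // the client gave up first
		}
	}))
	defer srv.Close()

	innerCap := 600 * time.Millisecond // stands for 60 s

	// Configured 25 s is below the cap, so the configured value decides.
	_, lowTimedOut = timedGet(srv.URL, 250*time.Millisecond, innerCap)

	// Configured 180 s is above the cap, so the cap decides.
	highElapsed, highTimedOut := timedGet(srv.URL, 1800*time.Millisecond, innerCap)
	return lowTimedOut, highTimedOut, highElapsed.Milliseconds()
}

func main() {
	low, high, ms := runDemo()
	fmt.Println("configured below cap timed out:", low)
	fmt.Println("configured above cap timed out:", high)
	fmt.Println("elapsed with configured above cap (ms):", ms)
}
```

In the second call the request ends near the 600 ms cap rather than at the configured 1800 ms, which mirrors the "values below 60 work, values above 60 do not" symptom. Whether the 60 second limit in this issue comes from such a second timeout, and where it would be set, remains an open question here.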

The error might be in the use of influxdb datasource code. Is this the repo https://github.com/grafana/influxdb-flux-datasource?

Looking at the file https://github.com/grafana/influxdb-flux-datasource/blob/master/pkg/models/settings.go
It seems that the function LoadSettings is using DefaultOptions:
model.Options = influxdb2.DefaultOptions()

This function LoadSettings is called from function newDataSourceInstance in file pkg/influx/datasource.go
This function is calling influxdb2.newClientWithOptions and is not setting a different SetHttpRequestTimeout

newDataSourceInstance function returns a grafana-plugin-sdk-go/backend/instancemgmt

What I don't understand is why it works when I use a timeout value lower than 60. Maybe the problem is saving the options from Grafana to the influxdb2 datasource?

I see this issue from 2020 that might be related: influxdata/influxdb-client-go#94

@marefr
Member

marefr commented Dec 6, 2022

@rodolk flux support has been part of Grafana since v7 or something. Make sure you're using the builtin Influx datasource rather than any external plugin. The code in question for the builtin flux support is here

opts := influxdb2.DefaultOptions()
opts.HTTPOptions().SetHTTPClient(dsInfo.HTTPClient)
return &runner{
	client: influxdb2.NewClientWithOptions(url, dsInfo.Token, opts),
	org:    org,
}, nil

@rodolk

rodolk commented Dec 6, 2022

@marefr OK, I assume I'm using the builtin influx datasource. How do I check this?
Would this line set the timeout among other parameters?

opts.HTTPOptions().SetHTTPClient(dsInfo.HTTPClient)

@marefr
Member

marefr commented Dec 7, 2022

No, it doesn't look like it. It's odd that the influx client provides a separate API for this instead of reusing the existing http.Client Timeout field; with that, it would have just worked as expected. Maybe a newer version of the influx client handles this better?

@gabor
Contributor

gabor commented Dec 9, 2022

@marefr it's strange, this is an old issue, but originally i did test it and larger-than-60-second timeouts worked fine. it was not possible to reproduce the problem. anyway, if the timeout-value is not propagated, then this is something we should look at (i'm still confused why it worked in the past). i'll reopen the issue, but i don't work in this area anymore, so pinging @grafana/observability-metrics 👍

@gabor gabor reopened this Dec 9, 2022
@gabor gabor removed this from the 8.0.0-beta3 milestone Dec 9, 2022
@gabor gabor removed their assignment Dec 9, 2022
@itsmylife itsmylife added needs investigation for unconfirmed bugs. use type/bug for confirmed bugs, even if they "need" more investigating prio/low It's a good idea, but not scheduled for any release labels Mar 21, 2023
@itsmylife itsmylife self-assigned this Mar 21, 2023
@lars20070

I face the same problem. I have connected Grafana to an InfluxDB data source.

I have extended the config /etc/grafana/grafana.ini as below and restarted the grafana-server service.

[dataproxy]
timeout = 1200

Yet if I run large queries in Grafana (which I expect to take several minutes) I get the timeout error below.

Post "http://localhost:8086/api/v2/query?org=main-org": net/http: timeout awaiting response headers (Client.Timeout exceeded while awaiting headers)

It seems the client, i.e. Grafana, runs into a timeout, but my config changes have no effect.

Grafana version: 10.2.2

10 participants