After upgrading to 1.2.0: series missing for queries, depending on time interval supplied in group by time #7879

Closed
FrederikP opened this issue Jan 25, 2017 · 17 comments

@FrederikP FrederikP commented Jan 25, 2017

Bug report

After upgrading InfluxDB from 1.1.1 to 1.2.0 I noticed that some series were missing in my Grafana graphs. For some reason these series reappeared when I decreased the browser window width (which leads to a bigger time interval in the GROUP BY time() clause that Grafana generates).

I then looked deeper into the queries, executed the exact same query on the same data using InfluxDB 1.1.1 and 1.2.0, and compared the results. The data I operated on were per-application CPU measurements (written every two minutes), which I grouped by the app tag.

The query grafana (and I) performed was:

curl -G http://localhost:8086/query --data-urlencode "db=mydb" --data-urlencode 'q=SELECT max("value") FROM "cpu_app" WHERE time > 1485323700000ms - 7d GROUP BY time(5m), "app" fill(null)' --data-urlencode "epoch=ms"
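
For anyone reproducing this outside the shell, the same request URL can be built programmatically, which avoids shell quoting pitfalls around the double-quoted identifiers. A minimal sketch using only the Python standard library; the helper name `build_query_url` is made up for illustration, and the host, database and query are the ones from the command above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_query_url(base, db, query, epoch="ms"):
    """Return a GET URL for the /query endpoint with URL-encoded parameters."""
    params = {"db": db, "q": query, "epoch": epoch}
    return f"{base}/query?{urlencode(params)}"

QUERY = (
    'SELECT max("value") FROM "cpu_app" '
    "WHERE time > 1485323700000ms - 7d "
    'GROUP BY time(5m), "app" fill(null)'
)

url = build_query_url("http://localhost:8086", "mydb", QUERY)
# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)
```

The URL can then be fetched with any HTTP client; the round-trip through `parse_qs` is only there to show that the encoding is lossless.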

For an interval time of 5 minutes the result included 15 series (apps) for influxdb 1.1.1 and only 5 series for influxdb 1.2.0.

When using an interval time of 1 hour the results included 15 series for both versions, which is the expected behavior.
I don't understand why there is a difference between 1.1.1 and 1.2.0 for the 5 minute interval version and I really don't know what's happening here. It's hard to give any more useful information, because I don't know what other info I could supply.

I'll attach the results of the 4 queries, so you can see the difference between the versions.
influx-check.tar.gz

Please let me know what other info I can supply to help you solve this problem.

System info:

InfluxDB 1.2.0
Debian Jessie
.deb install

Steps to reproduce:

  1. write data to influx
  2. perform queries with different group by time intervals on 1.1.1 and 1.2.0 version of influx

Expected behavior:

The number of series is equal for 1.1.1 and 1.2.0 for all queries.

Actual behavior:

The number of series differs depending on the interval size when compared to the complete queried time frame. I guess the relation between interval size and queried time frame is important here, but I don't know how exactly.

Author

@FrederikP FrederikP commented Jan 25, 2017

I took two screenshots in Grafana to visualize the issue. I set a start and end time and took a screenshot on 1.1.1. Then I upgraded only the database to 1.2.0 and refreshed the Grafana graph (not the complete page, only the data, using Grafana's update button), so it displays the exact same time frame and thus uses the same query.

Screenshot for v1.1.1:
influx111

Screenshot for v1.2.0:
influx120

As you can see, the 1.2.0 result is missing many series (for example the sshd one), just as described in the issue and shown in the raw results that I attached there.

What I noticed is: The smaller I make the browser window (horizontally), the more series come up, meaning: the bigger the group by interval, the more series show up.

@pauldix pauldix added this to the 1.21 milestone Jan 25, 2017
@pauldix pauldix added this to the 1.2.1 milestone Jan 25, 2017
@pauldix pauldix removed this from the 1.21 milestone Jan 25, 2017
Contributor

@jsternberg jsternberg commented Jan 25, 2017

I think this may be caused by a bug I fixed at the same time as when I implemented subqueries. If you notice, the responses in the 1.2 version have "partial": true attached to them. There is one inside of the series which means that the series was split into a separate response and there is one on the result object itself meaning that the query response is telling the client that there will be another JSON object to follow.

Can you see if Grafana is using chunked=true when issuing a query? We have lagged behind in terms of making sure there's a JavaScript client that can handle very large query returns. You can also just disable this feature by setting max-row-limit = 0 in the configuration file. Be aware that the limit has a non-zero default to prevent the server from blowing up when a query is too large, which is why it is trying to send a partial response.
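
For clients that do request `chunked=true`, the response body is a stream of JSON objects rather than a single document, and the pieces have to be stitched back together. A rough sketch of that client-side merge, assuming one JSON object per line and that a continued series repeats the same `name` and `tags`; `merge_chunks` and the sample payload are made up for illustration and are not taken from any client library.

```python
import json

def merge_chunks(lines):
    """Merge newline-delimited chunked responses into {(statement_id, name, tags): rows}."""
    merged = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines
        chunk = json.loads(line)
        for result in chunk.get("results", []):
            sid = result.get("statement_id", 0)
            for series in result.get("series", []):
                tags = tuple(sorted(series.get("tags", {}).items()))
                key = (sid, series["name"], tags)
                merged.setdefault(key, []).extend(series.get("values", []))
    return merged

# Hypothetical two-chunk response for a single series.
sample = [
    '{"results":[{"statement_id":0,"series":[{"name":"cpu_app","tags":{"app":"sshd"},'
    '"columns":["time","max"],"values":[[1,0.5],[2,0.7]],"partial":true}],"partial":true}]}',
    '{"results":[{"statement_id":0,"series":[{"name":"cpu_app","tags":{"app":"sshd"},'
    '"columns":["time","max"],"values":[[3,0.9]]}]}]}',
]
rows = merge_chunks(sample)
```

Keying on statement, measurement name and tag set keeps rows from different series apart while still joining a series that was split across chunks.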

Contributor

@jsternberg jsternberg commented Jan 25, 2017

Also, removing chunked=true won't stop the problem. If chunked=false and the query exceeds max-row-limit, it just truncates the rest and the server response tells you the data is partial. This is to protect from people issuing queries that could OOM the server.
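
Since a truncated, non-chunked response still parses as valid JSON, a client has to look for the `partial` flag explicitly. A small sketch of such a check, assuming the flag appears in the two places described above (on the result object and on an individual series); `is_truncated` is a hypothetical helper and the sample responses are invented.

```python
def is_truncated(response):
    """Return True if any result or series in the decoded response is marked partial."""
    for result in response.get("results", []):
        if result.get("partial"):
            return True
        if any(s.get("partial") for s in result.get("series", [])):
            return True
    return False

complete = {"results": [{"series": [{"name": "cpu_app", "values": [[1, 0.5]]}]}]}
cut_off = {
    "results": [
        {
            "series": [{"name": "cpu_app", "values": [[1, 0.5]], "partial": True}],
            "partial": True,
        }
    ]
}
```

A dashboard could use a check like this to warn that data is missing instead of silently drawing fewer series.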

Author

@FrederikP FrederikP commented Jan 26, 2017

As far as I can tell grafana doesn't use chunked=true in queries. Also the response includes "partial": true, but no following request is performed.

After setting max-row-limit to 0 in my influx config, the graph behaves as expected and all data is shown. So this is not an influx issue, but a grafana issue that their team is maybe not aware exists.
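
The series counts in the original report are consistent with a row cap. A back-of-the-envelope sketch, assuming a cap of 10000 rows (believed to be the 1.2.0 default for `max-row-limit`, but not stated in this thread), that `fill(null)` yields one row per time bucket per series, and that a series shows up as soon as it has started; `series_returned` is a made-up helper, not InfluxDB code.

```python
import math

def series_returned(range_s, interval_s, n_series, row_limit=10_000):
    """Estimate how many series appear before the cumulative row count hits the limit."""
    rows_per_series = math.ceil(range_s / interval_s)
    rows, shown = 0, 0
    for _ in range(n_series):
        shown += 1               # this series starts, so it shows up in the response
        rows += rows_per_series
        if row_limit and rows >= row_limit:
            break                # response is cut off here and flagged as partial
    return shown

WEEK = 7 * 24 * 3600
five_min = series_returned(WEEK, 300, 15)   # 2016 rows per series
one_hour = series_returned(WEEK, 3600, 15)  # 168 rows per series
```

Under these assumptions the 5 minute query stops at 5 series and the 1 hour query returns all 15, which matches the counts in the report.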

Thanks for your response.

@petrslavotinek petrslavotinek commented Jan 26, 2017

I've also noticed this behavior, first in Grafana but then also in the InfluxDB admin interface, so it seems like an InfluxDB issue.

I have queries that I execute in the admin interface all at once:
A = SELECT mean("value") FROM "sauter" WHERE "id" = '1880' AND time > now() - 60d GROUP BY time(1h);
B = SELECT mean("value") FROM "sauter" WHERE "id" = '1894' AND time > now() - 60d GROUP BY time(1h);
C = SELECT mean("value") FROM "sauter" WHERE "id" = '3640' AND time > now() - 60d GROUP BY time(500s);
D = SELECT mean("value") FROM "sauter" WHERE "id" = '1922' AND time > now() - 60d GROUP BY time(1000s);

The number of results returned by the server depends on their order.
Order A;B;C;D (1h,1h,500s,1000s) => returns A, B, C => D missing.
Order A;B;D;C (1h,1h,1000s,500s) => returns A, B, D, C => ok.
Order A;C;B;D (1h,500s,1h,1000s) => returns A, C => B, D missing.
Order A;C;D;B (1h,500s,1000s,1h) => returns A, C => B, D missing.
Order A;D;B;C (1h,1000s,1h,500s) => returns A, D, B, C => ok.
Order A;D;C;B (1h,1000s,500s,1h) => returns A, D, C => B missing.
Order C;A;B;D (500s,1h,1h,1000s) => returns C => A, B, D missing.
Order C;A;D;B (500s,1h,1000s,1h) => returns C => A, B, D missing.
Order C;D;A;B (500s,1000s,1h,1h) => returns C => A, B, D missing.
Order D;A;B;C (1000s,1h,1h,500s) => returns D, A, B, C => ok.
Order D;A;C;B (1000s,1h,500s,1h) => returns D, A, C => B missing.
Order D;C;A;B (1000s,500s,1h,1h) => returns D, C => A, B missing.

C always seems to break it.

I've also tried some other combinations:
1h;1h;1000s;1000s => returns all.
1000s;1000s;1h;1h => returns first two.
1h;1000s;1h;1000s => returns all.
1000s;1h;1000s;1h => returns first three.

But
1s;1h;1s;1h => returns only first one.

For every x from 1s to 518s:
x;1h;x;1h => returns only the first one.
For every x from 519s to 605s:
x;1h;x;1h => returns the first two.
For every x from 606s to 1211s:
x;1h;x;1h => returns the first three.
For every x from 1212s upwards:
x;1h;x;1h => returns all.

I hope it's understandable :)
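
The thresholds listed above line up with a row cap that is counted across all statements in one request. A sketch of that reading, assuming a cap of 10000 rows (believed to be the 1.2.0 default for `max-row-limit`, which is an assumption here), roughly one row per time bucket over the 60 day range, and that a statement appears in the output once it has started. This is a model fitted to the reported numbers, not InfluxDB's actual logic, and `statements_returned` is a hypothetical name.

```python
import math

RANGE_S = 60 * 24 * 3600  # the 60d window used in queries A-D

def statements_returned(intervals_s, row_limit=10_000):
    """Count statements that start before the cumulative row total reaches the limit."""
    rows, shown = 0, 0
    for interval in intervals_s:
        shown += 1                            # statement has started, so it is shown
        rows += math.ceil(RANGE_S / interval) # approximate bucket count for this one
        if rows >= row_limit:
            break                             # everything after this is dropped
    return shown

# x;1h;x;1h for the boundary values of x reported above
boundaries = {
    x: statements_returned([x, 3600, x, 3600])
    for x in (518, 519, 605, 606, 1211, 1212)
}
```

With these assumptions the model reproduces the reported switch points (518s/519s, 605s/606s, 1211s/1212s) as well as the A;B;C;D orderings, which suggests the missing results come from the row limit rather than from query C itself.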

Contributor

@jsternberg jsternberg commented Jan 26, 2017

The InfluxDB admin interface has the same issue since it doesn't use chunking. At the moment, there isn't very good library support for that feature. It's a work in progress.

@kostko kostko commented Jan 29, 2017

It seems that chunking is not supported in the Python client anymore (influxdata/influxdb-python#318) as it was removed in 2015? Is there a reason why it was removed? Because with InfluxDB 1.2.0 it seems that chunking is required (if one doesn't want to change the default configuration), but many client libraries don't even support it?

Contributor

@jsternberg jsternberg commented Jan 31, 2017

We are going to continue trying to improve support in the client libraries for chunking so the out of box behavior is better in the future, but until then, please just set max-row-limit to zero to disable the row limit.

Sorry for the inconvenience.

@temenoskrpavithra temenoskrpavithra commented Feb 13, 2017

In Grafana, I find one of the series hidden. I had initially filed a bug in Grafana, assuming that Grafana was hiding it.
After seeing the details of this issue, I set max-row-limit = 0 in the influxdb.conf file.
I have restarted InfluxDB and Grafana. Still, it makes no difference in Grafana.

I still find "partial": true in the query response for the affected series on the Grafana end.

Am I missing additional configuration in InfluxDB? How should I set chunked=false and the chunk size in Grafana/InfluxDB as a configuration setting?

@temenoskrpavithra temenoskrpavithra commented Feb 27, 2017

Does anyone else experience a problem similar to mine? Even after setting max-row-limit to zero, I see "partial": true and a hidden series. I'm on the latest versions of InfluxDB and Grafana.

Contributor

@jwilder jwilder commented Feb 27, 2017

@temenoskrpavithra please attach your config.

@temenoskrpavithra temenoskrpavithra commented Feb 27, 2017

Here is the config file that was generated post installation. This is a basic setup and we are in the initial stage of testing this with grafana.
Sampleinfluxconf.txt

Contributor

@jwilder jwilder commented Feb 27, 2017

@temenoskrpavithra Your [http] section header is commented out. You need to uncomment it for the config settings to take effect.

Change:

# [http]

to

[http]

The section headers were initially commented out, but we fixed that in cbb689e because it caused confusion for many people.
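
The effect of a commented-out header is easy to miss, because the keys below it still look active. A simplified sketch of the rule, assuming TOML-style semantics where a key belongs to the most recent uncommented `[section]` line; `section_of` is a hypothetical line scanner, not a real TOML parser, and the `[data]` section in the sample is only there for illustration.

```python
def section_of(config_text, key):
    """Return the section an uncommented `key = ...` line falls under, or None if absent."""
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments, including "# [http]", are ignored
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
        elif line.split("=", 1)[0].strip() == key:
            return current
    return None

broken = '[data]\n  dir = "/var/lib/influxdb/data"\n# [http]\n  max-row-limit = 0\n'
fixed = broken.replace("# [http]", "[http]")
```

In this model `section_of(broken, "max-row-limit")` yields `"data"` while the fixed text yields `"http"`, so with the header commented out the HTTP service never sees the setting.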

@temenoskrpavithra temenoskrpavithra commented Feb 27, 2017

Thank you for your reply. After enabling [http], I now get authentication-related errors.

Below are the settings in influxdb.conf:

 # Determines whether HTTP endpoint is enabled.
 #  enabled = true
 # The bind address used by the HTTP service.
  bind-address = ":8086"
 # Determines whether HTTP authentication is enabled.
 #  auth-enabled = false
 # The default realm sent back when issuing a basic auth challenge.
 # realm = "InfluxDB"
 # Determines whether HTTP request logging is enabled.
 log-enabled = true
 # Determines whether detailed write logging is enabled.
 # write-tracing = false
 # Determines whether the pprof endpoint is enabled.  This endpoint is used for
 # troubleshooting and monitoring.
  pprof-enabled = true
 # Determines whether HTTPS is enabled.
 # https-enabled = false
 # The SSL certificate to use when HTTPS is enabled.
 # https-certificate = "/etc/ssl/influxdb.pem"
 # Use a separate private key location.
 # https-private-key = ""
 # The JWT auth shared secret to validate requests using JSON web tokens.
 # shared-secret = ""
 # The default chunk size for result sets that should be chunked.
  max-row-limit = 0
 # The maximum number of HTTP connections that may be open at once.  New connections that
 # would exceed this limit are dropped.  Setting this value to 0 disables the limit.
  max-connection-limit = 0
# Enable http service over unix domain socket
 # unix-socket-enabled = false
 # The path of the unix domain socket.
 # bind-socket = "/var/run/influxdb.sock"

In influx editor, I'm able to view all DBs and query all DB data.
In the Grafana datasource, for some DBs, I get an InfluxDB:Undefined error. I have tried combinations of Direct/proxy access, with basic authentication and with credentials.

I have granted ALL privileges to the ID in the DB.

What is the recommended setting within [http] in the config file, and what should the authentication method be on the Grafana end?

Contributor

@jwilder jwilder commented Feb 27, 2017

Since you had the [http] section header commented out, your config entries under that section were not taking effect until now. Since you have auth-enabled = true in that section, your Grafana datasource config will need to use a valid username and password.

@temenoskrpavithra temenoskrpavithra commented Feb 27, 2017

Yes, I agree. I have one admin user in InfluxDB and use that ID to connect from the Grafana end to, say, a couple of DBs in InfluxDB. They connect successfully and I see the charts.
With the same user credentials and similar settings, it gives '401 unauthorized' with other DBs.
I'm very sure that the right user ID and password are being used.

@temenoskrpavithra temenoskrpavithra commented Feb 27, 2017

I got it working. That was some cache related issue on the browser. I opened another browser and found that it works fine.

Thank you very much for your help.
