getting null in response to api/v1/series #1698
Comments
Theory on how this can happen when the storage is corrupted: Callstack:
My storage fu currently doesn't go deep enough to immediately say on which level this should be filtered out. Whether |
beorn7 self-assigned this on Jun 2, 2016
I'll look into it once I find time.
brian-brazil added the kind/bug label on Jun 9, 2016
With the changes that quarantine series that ran into an error, this should be solved.
cwarden commented Feb 9, 2017
I think I'm seeing the same problem in 1.5.1
brian-brazil added this to the v2.x milestone on Apr 6, 2017
fabxc removed this from the v2.x milestone on Jul 3, 2017
brian-brazil added the priority/P3, help wanted, and component/ui labels on Jul 14, 2017
beorn7 removed their assignment on Aug 8, 2017
I'm going to presume this is resolved with 2.0, as the implementation all changed. If not, please let us know.
brian-brazil closed this on Nov 17, 2017
thanks! will do. :)
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
certainmagic commented Jun 1, 2016
first, let me say that prometheus is great. along with grafana, we've just started using it and it is really helping us tell what's going on in our servers. so thanks!
i noticed that my grafana dashboards were choking on some of my template variable queries. it looks like the problem is that the response contains nulls.
here's the query:
here's a snippet of the response with almost all of the data elements removed:
i had this problem on 0.18.0 (prometheus, version 0.18.0 (branch: release-0.18, revision: f12ebd6)).
i also tried running with 0.19.2 just in case it had been fixed recently. the problem continues in 0.19.2.
I asked for pointers in this thread.
Björn asked me to move my data aside and run again. That fixed the problem. I think that both of the servers that had this problem had run out of space earlier. I've also found that we were killing them ungracefully 90 seconds after sending a TERM to shut them down -- and that checkpoints are currently taking about 2 minutes, so they weren't shutting down cleanly. I've seen it run a crash recovery after that. We are addressing the disk space issues and have extended the timeout to let it checkpoint cleanly on shutdown.
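The shutdown timing described above (a kill after 90 seconds while a checkpoint takes about 2 minutes) can be sketched as follows. This is only an illustration of the TERM, wait, then KILL pattern, not how any particular init system or Prometheus itself does it; the helper name `stop_gracefully` and the 300-second grace period in the usage note are assumptions chosen for the example.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float) -> bool:
    """Send SIGTERM and wait up to grace_seconds before falling back to SIGKILL.

    Returns True if the process exited on its own within the grace period,
    False if it had to be killed (which would skip any final checkpoint).
    """
    proc.terminate()  # SIGTERM on POSIX: asks the process to shut down cleanly
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: the process gets no chance to finish its work
        proc.wait()
        return False
```

With a checkpoint that takes roughly 2 minutes, a call such as `stop_gracefully(proc, 300)` leaves headroom, whereas a 90-second grace period would end in the SIGKILL branch.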
Björn suggested that there's a good chance it was corrupted indices.
Björn asked me to file a bug suggesting that prometheus respond better to corruption by not returning the nulls and logging an error.
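Until the server filters these out, a client can apply the same idea on its side. The following is a minimal sketch, assuming the nulls appear as elements of the `data` array in the api/v1/series response; the function name `drop_null_series` and the sample body are hypothetical and not taken from the original report.

```python
import json
import logging

logger = logging.getLogger("series_filter")


def drop_null_series(body: str) -> list:
    """Parse an api/v1/series response body and keep only non-null label sets."""
    payload = json.loads(body)
    series = payload.get("data") or []
    kept = [s for s in series if s is not None]
    dropped = len(series) - len(kept)
    if dropped:
        # Log instead of silently passing nulls on to the dashboard.
        logger.error("dropped %d null series entries (possible storage corruption)", dropped)
    return kept


# Hypothetical response body with one null entry, for illustration only.
sample = (
    '{"status": "success", "data": ['
    '{"__name__": "up", "job": "node"}, null, {"__name__": "up", "job": "api"}]}'
)
print(drop_null_series(sample))
```

This only hides the symptom on the client; the underlying corrupted data still needs to be dealt with on the server.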
Please let me know if there's any more information I can provide.
thanks again!
ab