Poller crash - "morestack on g0" #4650
Hmm, this is weird... One possibility is a data race in the code, so it would be good if you could run the binary compiled with race detection enabled.
Sure - would that solve anything, or create more informative logs for the next crash? I can run a race binary.
It would provide additional information, as it would log the data race if any is detected. The downside is that the program will be slower and use more memory.
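For readers unfamiliar with the race detector: here is a minimal, hypothetical sketch (not Prometheus code) of the kind of bug it catches. Two goroutines increment a shared counter without synchronization; a binary built with `go build -race` would report a data-race warning at runtime, while a plain build usually appears to work.

```go
package main

import (
	"fmt"
	"sync"
)

// raceyCount increments a shared counter from two goroutines without any
// synchronization -- an unsynchronized read-modify-write that the race
// detector flags as a data race.
func raceyCount() int {
	var counter int
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				counter++ // racy: no mutex or atomic around this write
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("final count:", raceyCount())
}
```

The slowdown and extra memory mentioned above come from the detector instrumenting every memory access, which is why running it on a production poller is costly.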
Having some trouble with the build target in Makefile.common: it doesn't seem to propagate CGO_ENABLED=1 to the go build.
You need to add this under go:
cgo: true
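For context, the setting above goes in the project's promu configuration. A hedged sketch of what the relevant fragment of a `.promu.yml` might look like (surrounding keys depend on your setup):

```yaml
# .promu.yml (fragment) -- enable cgo so race-enabled builds work
go:
    cgo: true
```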
Thanks, that worked. I'll report back with the new logs from the race binary (when it happens again).
Unfortunately the performance hit was rather big. I'll need a better way to run this than on my two main HA pollers - maybe a third poller.
The poller crashes after a long startup when
This looks like running with race detection requires too much memory on your system, so no luck there. Can you run
@sevagh ping
Oops, so what happened is: I was running poller 2.3.2 (where promtool doesn't have debug). I upgraded to 2.4.2. At the same time, somebody reduced the number of metrics by around 12x (there was an application bug where many scrape targets were exposing duplicate metrics). I haven't seen any crashes since.
Thanks for the follow-up! I'm closing this now; feel free to reopen if it crashes again, but the system was probably unstable because of too many metrics.
sevagh commented Sep 24, 2018
Bug Report
What did you do?
Running the Prometheus poller, which crashed overnight. It seems to crash regularly (this poller in particular has more scrape targets than my other pollers; it's in a higher-traffic DC).
Some Grafana graphs on the performance characteristics were attached (images not reproduced here).
Environment