prometheus crash case #2293
Comments
Can someone tell me how to deal with this crash?
time="2016-12-21T15:51:26+08:00" level=info msg="Starting prometheus (version=1.2.3, branch=master, revision=c1eee5b0da2540b9dfd2f70752015b0fce83b616)" source="main.go:75" runtime stack: goroutine 38137 [running]: goroutine 1 [select, 15 minutes]: goroutine 12 [syscall, 15 minutes]: goroutine 115 [select, 15 minutes]: goroutine 136 [select, 15 minutes]: goroutine 135 [select]: goroutine 137 [select]: goroutine 116 [select]: goroutine 117 [select, 15 minutes]: goroutine 118 [select, 15 minutes]: goroutine 119 [select]: goroutine 120 [select, 15 minutes]: goroutine 121 [select]: goroutine 122 [select, 15 minutes]: goroutine 123 [select, 15 minutes]: goroutine 124 [select]: goroutine 125 [select, 15 minutes]: goroutine 126 [select]: goroutine 127 [select, 15 minutes]: goroutine 128 [select, 15 minutes]: goroutine 145 [select]: goroutine 146 [select, 15 minutes]: goroutine 147 [select]: goroutine 148 [select, 15 minutes]: goroutine 149 [select, 15 minutes]: goroutine 150 [select]: goroutine 46 [select]: goroutine 47 [select, 15 minutes]: goroutine 48 [select, 15 minutes]: goroutine 161 [select]: goroutine 159 [select]: goroutine 177 [select]: goroutine 178 [select]: goroutine 166 [select, 15 minutes]: goroutine 141 [select, 15 minutes]: goroutine 168 [semacquire, 15 minutes]: goroutine 169 [IO wait]: goroutine 138 [select, 15 minutes]: goroutine 151 [select, 15 minutes]: goroutine 38181 [runnable, locked to thread]: goroutine 289 [select]: goroutine 171 [chan send]: goroutine 292 [IO wait]: goroutine 158 [select]: goroutine 157 [select]: goroutine 291 [select]: goroutine 296 [select]: goroutine 294 [select]: goroutine 293 [select]: goroutine 290 [select]: goroutine 295 [select]: goroutine 223 [select]: goroutine 224 [select]: goroutine 38196 [chan send]: goroutine 160 [select]: goroutine 273 [select]: goroutine 274 [select]: goroutine 275 [select]: goroutine 276 [select]: goroutine 277 [select]: goroutine 278 [select]: goroutine 297 [select]: goroutine 298 [select]: 
goroutine 38175 [select]: goroutine 38226 [IO wait]: goroutine 38168 [IO wait]: goroutine 38219 [IO wait]: goroutine 38133 [select]: goroutine 38171 [runnable]: goroutine 38228 [IO wait]: goroutine 38176 [IO wait]: goroutine 38230 [chan send]: goroutine 38182 [runnable]: goroutine 38136 [IO wait]: goroutine 38140 [IO wait]: goroutine 38174 [IO wait]: goroutine 38220 [select]: goroutine 38227 [select]: goroutine 38225 [select]: goroutine 38134 [select]: goroutine 38229 [select]:
-storage.local.memory-chunks 1G
Should I increase -storage.local.memory-chunks to 3G?
When I increase it to 3G, the log is: goroutine 163 [running]:
Your process / machine is running out of memory in the first crash ( In the second one, Prometheus tries to allocate a Go slice that's larger than possible. I'm not sure why exactly, but maybe it has something to do with the very high flag values you provide. Can you give the actual flags that you start Prometheus with? That is, is
Yes. My machine has 16GB. When I start Prometheus with -storage.local.memory-chunks 1GB
The default config is 1MB; with it, OOM occurs occasionally.
Now 40551 series are loaded; 11 machines are scraped every 3s and 11 machines every 10s.
is 8000000000000
Sorry, but the numbers you mention don't match up or don't really make sense. -storage.local.memory-chunks does not take a GB (bytes) value, but a number of chunks. The maximum it should be set to is a couple of million (certainly not billions or more!). On a machine with only 16GB of RAM, it should probably not be set to more than 2 million or something like that. "is 8000000000000" -> what do you mean by that? Certainly no flag value should be set to 8 trillion. If that is the value for -storage.local.memory-chunks, then Prometheus will try to keep 8 trillion chunks in memory before it evicts them. It is no wonder you are getting out-of-memory crashes then. Try a value in the lower millions, or even less. If you really only have 40551 series, you probably want to set this to only a couple of hundred thousand.
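To make the "number of chunks, not bytes" point concrete, here is a rough back-of-the-envelope sketch. It rests on two assumptions that are not stated in this thread: that each in-memory chunk holds 1024 bytes, and that total RAM use is about three times the raw chunk bytes (a rule of thumb as I recall it from the Prometheus 1.x storage documentation). The function names are made up for illustration and are not part of Prometheus.

```python
# Assumptions (not from this thread): 1024 bytes per chunk, and total RAM
# use of roughly 3x the raw chunk bytes. Treat both as rules of thumb for
# Prometheus 1.x local storage, not exact figures.
CHUNK_BYTES = 1024
RAM_OVERHEAD_FACTOR = 3
GIB = 1024 ** 3


def estimated_ram_bytes(memory_chunks: int) -> int:
    """Rough RAM needed to keep `memory_chunks` chunks in memory."""
    return memory_chunks * CHUNK_BYTES * RAM_OVERHEAD_FACTOR


def fits_in_ram(memory_chunks: int, ram_bytes: int) -> bool:
    """True if the estimate stays within the given amount of RAM."""
    return estimated_ram_bytes(memory_chunks) <= ram_bytes


if __name__ == "__main__":
    ram = 16 * GIB  # the 16GB machine mentioned in this thread
    # Candidate values for -storage.local.memory-chunks discussed above.
    for chunks in (200_000, 2_000_000, 1_000_000_000, 8_000_000_000_000):
        need_gib = estimated_ram_bytes(chunks) / GIB
        print(f"{chunks:,} chunks: ~{need_gib:,.1f} GiB, "
              f"fits in 16 GiB: {fits_in_ram(chunks, ram)}")
```

Under these assumptions, 2 million chunks come to roughly 6 GB, which leaves headroom on a 16GB machine, while 1 billion chunks would need about 3 TB and 8 trillion is far beyond any machine.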
So, how do I avoid running out of memory?
jinhang closed this Dec 22, 2016
timchenxiaoyu commented Jul 19, 2017
My Prometheus crashed. The picture shows that it crashed twice, at 12 PM and 8 AM.
Jul 19 08:12:38 slave-95 kernel: INFO: task prometheus:20402 blocked for more than 120 seconds.
timchenxiaoyu commented Jul 19, 2017
My targets have not changed, so why does write IO keep increasing?
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.


jinhang commented Dec 21, 2016 (edited)