fatal error: concurrent map read and map write (when using delete series) #4154
brian-brazil added the kind/bug and component/local storage labels on May 11, 2018
As an aside, using delete this often is really not recommended.
Is there any guideline on timing so as not to trigger this error? Every 30 minutes?
This is a bug and should be fixed. If you are running into this with a cron interval of 5 minutes, you'll likely run into it at 30 minutes too.
Ahh, I see. Okay, I will wait for the bug fix. Thank you.
codwu commented May 14, 2018
This can be fixed by adding some lines to prometheus/tsdb/querier.go, like
@codwu thanks! Can you open a PR? I will review it.
codwu referenced this issue on May 14, 2018
Merged: add rwmutex to prevent concurrent map read when delete series #330
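The title of the merged PR names the general remedy: guard the shared map with a sync.RWMutex so that queries and the delete path never touch it at the same time. The following is a minimal Go sketch of that technique only, not the actual patch from prometheus/tsdb#330. The type guardedMap, its fields, and its methods are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedMap is a hypothetical stand-in for a map that is read by
// queries and written by deletes. It is not a real tsdb type.
type guardedMap struct {
	mtx  sync.RWMutex
	data map[uint64]int64 // series reference -> "deleted up to" timestamp
}

func newGuardedMap() *guardedMap {
	return &guardedMap{data: make(map[uint64]int64)}
}

// get takes the read lock, so many readers may run in parallel.
func (g *guardedMap) get(ref uint64) (int64, bool) {
	g.mtx.RLock()
	defer g.mtx.RUnlock()
	v, ok := g.data[ref]
	return v, ok
}

// set takes the write lock, which excludes all readers and other writers.
func (g *guardedMap) set(ref uint64, ts int64) {
	g.mtx.Lock()
	defer g.mtx.Unlock()
	g.data[ref] = ts
}

func main() {
	g := newGuardedMap()
	var wg sync.WaitGroup
	// Readers and writers run concurrently; without the locks the Go
	// runtime could abort with "concurrent map read and map write".
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func(i int) { defer wg.Done(); g.set(uint64(i%4), int64(i)) }(i)
		go func(i int) { defer wg.Done(); g.get(uint64(i % 4)) }(i)
	}
	wg.Wait()
	_, ok := g.get(0)
	fmt.Println("series 0 present:", ok)
}
```

An RWMutex suits this access pattern when reads far outnumber writes, since readers do not block each other and only a delete briefly excludes them.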
While testing the bug fix, the first time Prometheus showed the error "context deadline exceeded", I ran a curl against the target and was able to get the metrics. Prometheus and the target are installed on the same server. Below is the log:
codwu commented Jun 5, 2018
It seems that the error "context deadline exceeded" is caused by sending alerts to Alertmanager and is not related to this issue.
The only unusual log here is
One other thing to be careful about is running more than one replica. If you mount the data folder from the host and run more than one replica at any given time, two or more Prometheus instances will write to the same data folder, which leads to many problems. Are there no other logs related to the crash?
fabxc closed this in prometheus/tsdb#330 on Jul 11, 2018
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
games130 commented May 11, 2018
What did you do?
I run a cron job to delete metric data every 5 minutes:
curl -XPOST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]=heplify_caller&end=xxx'
Here xxx is the current time minus 1 hour.
I am doing this because all my other data is stored for 6 months, but for this particular metric I only need the latest 1 hour, so I delete anything older than 1 hour.
What did you expect to see?
Metrics older than 1 hour are deleted every 5 minutes.
What did you see instead? Under which circumstances?
The deletion of the metric works, but Prometheus crashes frequently. The log is attached below.
I tested by removing the cron job, and it then runs without crashing.
Environment
Linux 3.10.0-693.11.1.el7.x86_64 x86_64
I am running Prometheus in Docker.
Prometheus version:
version=2.2.1
Prometheus configuration file: