[Feature request] dropping chunk throttle #2384
Comments
beorn7 self-assigned this on Feb 1, 2017
beorn7 added the component/local storage and kind/enhancement labels on Feb 1, 2017
That's a good point. While maintenance of in-memory series needs to be fast in rushed mode to persist more chunks, the chunks of archived series are obviously all persisted already.
With a server in effectively permanent rushed mode (which is going to be most large servers), slowing this down means you are going to run out of disk space. These IOPS do not contend with the persistence IOPS, as at most one of these functions can be running at any given time.
Right. I thought for a while that the maintenance of archived series would be sped up as well with increased persist urgency, but I did the right thing from the beginning, see https://github.com/prometheus/prometheus/blob/v1.5.0/storage/local/storage.go#L1238. So yeah, I think leaving it at the normal, even throttling is the right thing to do. In rushed mode, the maintenance frequency of archived series will be much lower than the one for memory series, so that's the right trade-off.
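To make that trade-off concrete, here is a minimal Go sketch of two evenly throttled maintenance loops where only the in-memory pass speeds up under a high urgency score; maintainMemorySeries, maintainArchivedSeries, urgencyScore, and the intervals are illustrative placeholders, not the actual storage.go code.

```go
// Sketch only: the in-memory maintenance loop shortens its wait when the
// urgency score rises (rushed mode), while the archived-series loop keeps
// its even pace regardless of urgency.
package main

import (
	"math/rand"
	"time"
)

// urgencyScore would come from the persistence layer; here it is a stub.
func urgencyScore() float64 { return rand.Float64() }

func maintainMemorySeries()   { /* persist and evict chunks of one series */ }
func maintainArchivedSeries() { /* check retention for one archived series */ }

func memoryMaintenanceLoop(stop <-chan struct{}) {
	base := 6 * time.Second
	for {
		wait := base
		if urgencyScore() > 0.8 { // rushed mode: work through series faster
			wait = base / 10
		}
		select {
		case <-stop:
			return
		case <-time.After(wait):
			maintainMemorySeries()
		}
	}
}

func archivedMaintenanceLoop(stop <-chan struct{}) {
	tick := time.NewTicker(time.Minute) // even throttling, independent of urgency
	defer tick.Stop()
	for {
		select {
		case <-stop:
			return
		case <-tick.C:
			maintainArchivedSeries()
		}
	}
}

func main() {
	stop := make(chan struct{})
	go memoryMaintenanceLoop(stop)
	go archivedMaintenanceLoop(stop)
	time.Sleep(3 * time.Second)
	close(stop)
}
```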
beorn7 closed this on Feb 1, 2017
Actually, we have a problem dropping chunks that are too old from our production Prometheus server. We have been using Prometheus for more than a year, but it is too difficult to drop those old chunks. To accomplish our goal, I am considering optimizing the chunk-dropping code.
I'm not familiar with the Prometheus code, so I might misunderstand something.
In general, the Prometheus server is not optimized for sudden decreases of the retention time, and I'm reluctant to do so if that increases the code complexity significantly or makes the regular use case (i.e. a constant or slightly changing retention time) worse. About your suggestions:
Of course, I don't want to add complexity or negative effects to Prometheus either.
I'll try to make a PR for this.
I'm sorry, I missed the point. I understand now that backward reads (and random reads?) are not good for HDDs. I'll reconsider how to drop too-old chunks effectively. In the "deleting series" API, we could take a more specific optimization approach.
I found an existing issue describing the same problem I ran into.
#1740 has been commented on already. Obviously, truncating series files at the right place requires I/O. If you reduce the retention time, that work has to be done. If your server was already heavily loaded, it might need to throttle ingestion to do so.
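As a rough Go illustration of why this costs I/O (dropHeadChunks and the fixed chunkLen are hypothetical, not Prometheus code): dropping the oldest chunks removes data from the front of a series file, and since a file can only be truncated cheaply at its end, the chunks that are kept have to be copied into a new file.

```go
// Sketch only: removing the first n chunks of a series file means copying the
// remaining data into a temporary file and renaming it over the original,
// which costs read and write I/O proportional to the data that is kept.
package main

import (
	"io"
	"os"
)

const chunkLen = 1024 // assumed fixed on-disk chunk size for this sketch

// dropHeadChunks rewrites path so that the first n chunks are gone.
func dropHeadChunks(path string, n int) error {
	in, err := os.Open(path)
	if err != nil {
		return err
	}
	defer in.Close()

	// Skip past the chunks we want to drop.
	if _, err := in.Seek(int64(n)*chunkLen, io.SeekStart); err != nil {
		return err
	}

	out, err := os.Create(path + ".tmp")
	if err != nil {
		return err
	}
	// Copy everything we keep; this is the unavoidable I/O.
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	if err := out.Close(); err != nil {
		return err
	}
	return os.Rename(path+".tmp", path)
}

func main() {
	_ = dropHeadChunks("series-00000001.db", 4)
}
```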
mtanda referenced this issue on Feb 5, 2017: storage: optimize dropping chunks by using minShrinkRatio #2397 (merged)
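The idea suggested by that PR's title is to skip the costly rewrite until enough of a series file is droppable. The Go sketch below only illustrates such a guard, with an assumed threshold and names that are not the actual PR #2397 code.

```go
// Sketch only: a minimum shrink ratio avoids rewriting a series file just to
// drop a handful of chunks; the rewrite waits until the droppable fraction of
// the file is large enough to be worth the I/O.
package main

import "fmt"

const minShrinkRatio = 0.1 // e.g. only shrink if >=10% of the file would go away

// shouldShrink reports whether dropping `droppable` of `total` chunks
// justifies rewriting the series file now.
func shouldShrink(droppable, total int) bool {
	if total == 0 {
		return false
	}
	return float64(droppable)/float64(total) >= minShrinkRatio
}

func main() {
	fmt.Println(shouldShrink(2, 100))  // false: not worth a rewrite yet
	fmt.Println(shouldShrink(15, 100)) // true: enough old chunks accumulated
}
```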
mtanda commented on Feb 1, 2017
The current implementation doesn't consider prometheus_local_storage_persistence_urgency_score:
https://github.com/prometheus/prometheus/blob/v1.5.0/storage/local/storage.go#L1210-L1250
I think it should consider prometheus_local_storage_persistence_urgency_score, so that the chunk-dropping activity does not cause too many disk IOPS and, in the end, trigger rushed mode.