
Optimization for spinning-disk servers. #533

Closed
beorn7 opened this Issue Feb 17, 2015 · 2 comments

Comments

beorn7 commented Feb 17, 2015

On a server with a conventional spinning disk and without a suitable RAID setup, Prometheus can only persist about 50 chunks/s (which is a bit less than initially expected, given that the average seek time is usually below 10ms, and the elevator algorithm should allow shorter-than-average seeks when working through the write cache). Note: if you spin up a fresh server, the chunk write rate might be much higher initially, but that's because the file system handles small files in an optimized fashion. Once the individual series files grow to a substantial size, the chunk persist rate drops to the value above.

Assuming ~100 samples per chunk, that limits ingestion speed to 5k samples/sec, which "ought to be enough for anybody". Or not... :)
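
For reference, the back-of-the-envelope calculation behind that number, using the figures above (both are rough assumptions, not measurements of any particular setup):

```go
package main

import "fmt"

func main() {
	// Figures from the discussion above; rough assumptions, not measurements.
	const chunksPerSecond = 50  // observed persist rate on a single spinning disk
	const samplesPerChunk = 100 // typical samples per chunk with the current encoding

	// Sustained ingestion is bounded by how fast completed chunks can be persisted.
	fmt.Printf("max sustained ingestion: %d samples/s\n", chunksPerSecond*samplesPerChunk)
}
```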

Improved chunk encodings will increase the number of samples per chunk and thereby increase the ingestion rate.

A recent tweak also helps if the number of series is moderate or if a small number of series creates many chunks: with the tweak, more than one consecutive chunk is persisted at a time if the persist queue backlogs too much (see the sketch below). A test setup showed that an ingestion rate of 20k samples/s is feasible this way. The problem is that a backlogged persist queue needs to be drained on shutdown, which can take a long time, not to mention the data loss upon crashing.
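
Roughly, the batching behaves like the following sketch; all names and types here are made up for illustration and are not the actual Prometheus storage code:

```go
// Sketch only: chunk and series are illustrative stand-ins.
type chunk struct{} // stands in for an encoded chunk of ~100 samples

type series struct {
	pending []chunk // consecutive not-yet-persisted chunks, oldest first
}

// persistSome writes either one chunk or, if the persist queue is backlogged,
// all pending consecutive chunks of the series in a single write, so the disk
// pays for one seek instead of one seek per chunk.
func persistSome(s *series, backlogged bool, write func([]chunk) error) error {
	if len(s.pending) == 0 {
		return nil
	}
	n := 1
	if backlogged {
		n = len(s.pending) // batch everything queued for this series
	}
	if err := write(s.pending[:n]); err != nil {
		return err
	}
	s.pending = s.pending[n:]
	return nil
}
```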

The fundamental solution to that is to include non-persisted chunks in checkpointing. To kill two birds with one stone, the following design could be implemented (a rough sketch follows the list below):

  • Checkpoints include not only the head chunk, but all chunks not yet persisted.
  • Remove the persist queue entirely.
  • Chunk persistence happens in the same maintenance loop as chunk purging, i.e. rewriting the chunks already on disk (if necessary) and adding the not-yet-persisted chunks happens in one write operation, requiring only one seek.
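
Here is the promised sketch of such a combined maintenance step; chunk is the same illustrative type as above, and seriesFile is a made-up stand-in for the real series-file I/O:

```go
// Sketch only: seriesFile is a hypothetical interface, not the real storage code.
type seriesFile interface {
	ReadChunks() ([]chunk, error) // chunks currently persisted in the file
	Rewrite([]chunk) error        // one sequential rewrite of the whole file
}

// maintainSeries combines purging and persisting for one series: it drops
// on-disk chunks that fell out of retention and appends the not-yet-persisted
// ones, all in a single write operation (and hence a single seek).
func maintainSeries(f seriesFile, pending []chunk, keep func(chunk) bool) error {
	onDisk, err := f.ReadChunks()
	if err != nil {
		return err
	}
	out := make([]chunk, 0, len(onDisk)+len(pending))
	for _, c := range onDisk {
		if keep(c) { // purge: drop chunks outside the retention window
			out = append(out, c)
		}
	}
	out = append(out, pending...) // persist: append the in-memory chunks
	return f.Rewrite(out)
}
```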

Checkpoint sizes would grow if there is a backlog, but that's the price for using the memory to de-multiplex the chunk persistence.

Implementing this is not high priority, though, as it only serves the use case where all of the following apply:

  • High ingestion rate and/or bad compressibility of the samples.
  • High number of series.
  • Spinning disk.
  • No suitable RAID setup.

beorn7 commented Mar 19, 2015

Despite low priority, this is implemented now.

beorn7 closed this Mar 19, 2015

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
