Adds a new spine that does compaction in a thread. #1863
Conversation
gz commented on Jun 11, 2024
- The idea is to have one thread per spine. Maybe later we can relax this to n background compactor threads and load-balance work across multiple spines (this is, e.g., what RocksDB does).
- The thread just does the compaction on behalf of the spine and sends the results back. For now it follows a simple FIFO order and compacts one merge at a time. In future improvements we can add n-way merges, and round-robin or other strategies to merge multiple things at the same time. (A minimal sketch of this channel-based design follows this list.)
- Having compaction in a different thread meant slightly changing the way the backend meta-data about read-only files is kept. Before, it was strictly per-thread and stored by the thread that created the files. Now it needs to be shareable between the compactor and the threads that hold spines, so this meta-data lives in a concurrent hashmap. (A second sketch of such a shared registry also follows this list.)
- There was also an issue with the lockfile on the storage dir not being dropped in tests. This was resolved by moving the StorageLocation to the RuntimeHandle.
- Had to make some small changes to tests, because some asserts were too strong: they assumed behavior of the old spine that isn't actually necessary.
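As a rough illustration of the channel-based design described above, here is a minimal, self-contained Rust sketch. It is not the actual DBSP API: `Batch`, `MergeRequest`, and `spawn_compactor` are hypothetical names, and the merge itself is a toy key/weight consolidation. The point is the shape of the protocol: the spine sends merge work over a FIFO channel, and the compactor thread sends the merged batch back.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a spine's immutable batch type.
#[derive(Debug, Clone)]
struct Batch {
    rows: Vec<(u64, i64)>,
}

/// Work item sent to the compactor: two batches to merge into one.
struct MergeRequest {
    left: Batch,
    right: Batch,
}

/// Spawn one compactor thread for a spine. Requests are processed in
/// simple FIFO order, one merge at a time; the merged batch is sent back
/// over the `done` channel for the spine to splice into its levels.
fn spawn_compactor() -> (Sender<MergeRequest>, Receiver<Batch>, JoinHandle<()>) {
    let (req_tx, req_rx) = channel::<MergeRequest>();
    let (done_tx, done_rx) = channel::<Batch>();
    let handle = thread::spawn(move || {
        // FIFO: block on the next request, merge, send the result back.
        while let Ok(MergeRequest { left, right }) = req_rx.recv() {
            let mut rows = left.rows;
            rows.extend(right.rows);
            rows.sort_by_key(|(k, _)| *k);
            // Consolidate duplicate keys by summing weights (toy merge logic).
            let mut merged: Vec<(u64, i64)> = Vec::with_capacity(rows.len());
            for (k, w) in rows {
                match merged.last_mut() {
                    Some((lk, lw)) if *lk == k => *lw += w,
                    _ => merged.push((k, w)),
                }
            }
            merged.retain(|(_, w)| *w != 0);
            if done_tx.send(Batch { rows: merged }).is_err() {
                break; // Spine dropped; stop compacting.
            }
        }
    });
    (req_tx, done_rx, handle)
}

fn main() {
    let (req_tx, done_rx, handle) = spawn_compactor();
    req_tx
        .send(MergeRequest {
            left: Batch { rows: vec![(1, 1), (2, 1)] },
            right: Batch { rows: vec![(2, 1), (3, -1)] },
        })
        .unwrap();
    println!("merged: {:?}", done_rx.recv().unwrap());
    drop(req_tx); // Closing the request channel lets the compactor exit.
    handle.join().unwrap();
}
```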
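Likewise, a small sketch of the shared read-only-file registry. The names `FileMeta` and `FileRegistry` are made up, and `RwLock<HashMap<..>>` from the standard library stands in for whatever concurrent hashmap the actual code uses; the only point carried over from the PR is that the metadata is no longer thread-local but shared between the worker threads that hold spines and the compactor thread.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical metadata kept for a read-only (immutable) batch file.
#[derive(Debug, Clone)]
struct FileMeta {
    path: String,
    size_bytes: u64,
}

/// Shared registry: previously per-thread, now shared between the worker
/// threads that own spines and the background compactor thread(s).
type FileRegistry = Arc<RwLock<HashMap<u64, FileMeta>>>;

fn main() {
    let registry: FileRegistry = Arc::new(RwLock::new(HashMap::new()));

    // A worker thread registers a file it just wrote.
    registry.write().unwrap().insert(
        42,
        FileMeta { path: "batch-42.bin".into(), size_bytes: 4096 },
    );

    // The compactor thread, holding a clone of the Arc, can look it up.
    let compactor_view = Arc::clone(&registry);
    let handle = thread::spawn(move || compactor_view.read().unwrap().get(&42).cloned());
    println!("{:?}", handle.join().unwrap());
}
```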
Force-pushed from 9f9efaa to 7892010.
Benchmark results: Nexmark (Galen); result tables not preserved.
The comment from above shows that it seems to improve performance for DRAM. I'll add/run the results with storage enabled too.
I'm reading the code this morning.
I read the code and I like how much simpler it is than the fueled spine. I think that the assertion failure we're having in the fueled spine code is going to be hard to confidently fix.
This is going to create a lot of threads in cases where we have a lot of spines. I suspect we'll have to do something about that, but this is a fine place to start.
Thank you!
@gz any idea why the peak RSS seems to be very high for the async spine? Is it accumulating work faster than it is able to process it?
I changed the number of background threads to the number of workers instead of the number of spines. (A small sketch of this sizing follows.)
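A hedged sketch of that sizing change: one compactor thread per worker rather than per spine, with all spines owned by a worker sharing that worker's sender. `MergeJob` and `spawn_compactors` are hypothetical names, not the PR's API.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical unit of compaction work (e.g., an identifier for a pending merge).
struct MergeJob(u64);

/// Spawn one background compactor per worker (not per spine). All spines
/// owned by worker `i` send their merge jobs to `senders[i]`.
fn spawn_compactors(num_workers: usize) -> (Vec<Sender<MergeJob>>, Vec<JoinHandle<()>>) {
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for worker in 0..num_workers {
        let (tx, rx) = channel::<MergeJob>();
        handles.push(thread::spawn(move || {
            while let Ok(MergeJob(id)) = rx.recv() {
                // Merge the batches belonging to job `id` here.
                println!("worker {worker}: compacting job {id}");
            }
        }));
        senders.push(tx);
    }
    (senders, handles)
}

fn main() {
    let (senders, handles) = spawn_compactors(4);
    senders[0].send(MergeJob(7)).unwrap();
    drop(senders); // Closing the channels lets the compactor threads exit.
    for h in handles {
        h.join().unwrap();
    }
}
```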
By popular demand, here are the numbers with min_storage_row set to 10_000 instead of 0 (as in the previous tables) for the async spine; result tables not preserved.
Unfortunately, the fueled spine fails with an error (output not preserved).
Force-pushed from 356c12f to 333e287.
Ran q9 with 50M events (min_storage_rows=100_000); results table not preserved.
- The idea is to have one background thread per worker to handle spine merges in the background. Maybe later the number of background workers can be made configurable.
- The thread just does the compaction on behalf of the spine and sends the results back. For now it follows a simple FIFO order and compacts one merge at a time. In future improvements we can add n-way merges, and round-robin or other strategies to merge multiple things at the same time.
- Having compaction in a different thread meant slightly changing the way the backend meta-data about read-only files is kept. Before, it was strictly per-thread and stored by the thread that created the files. Now it needs to be shareable between the compactor and the threads that hold spines, so this meta-data lives in a concurrent hashmap.
- There was also an issue with the lockfile on the storage dir not being dropped in tests. This was resolved by moving the StorageLocation to the RuntimeHandle.
- Had to make some small changes to tests, because some asserts were too strong: they assumed behavior of the old spine that isn't actually necessary.
- I changed the concurrency for the buffer cache (again): it is now shared and protected with a Mutex, but the sharing is limited to a worker and its background thread. (A small sketch of this layout follows below.)

Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
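A small sketch of the buffer-cache sharing described in the last bullet, under the assumption that the cache is wrapped in an `Arc<Mutex<..>>` that is cloned exactly twice: once for the worker and once for its dedicated background compactor thread. `BufferCache` here is a made-up stand-in, not the actual type in the PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the storage backend's buffer cache.
#[derive(Default)]
struct BufferCache {
    blocks: HashMap<u64, Vec<u8>>,
}

fn main() {
    // One cache per worker, shared only between the worker and its
    // dedicated background compactor thread (two Arc clones, one Mutex).
    let cache = Arc::new(Mutex::new(BufferCache::default()));

    let compactor_cache = Arc::clone(&cache);
    let compactor = thread::spawn(move || {
        // The background thread populates the cache while merging batches.
        compactor_cache.lock().unwrap().blocks.insert(1, vec![0u8; 512]);
    });
    compactor.join().unwrap();

    // The worker sees the block written by its compactor thread.
    let worker_view = cache.lock().unwrap();
    println!("cached blocks: {}", worker_view.blocks.len());
}
```

Because the sharing is limited to two threads, a plain Mutex keeps the design simple; contention stays bounded since only the worker and its own compactor ever take the lock.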