CQ shared message store improvements #6090
Conversation
I don't usually push commits with just my thoughts about the changes I'm currently making, but when I do, I do it in style and break compilation.
After all this is done, the PR will be in a good enough state to merge.
This commit replaces file combining with single-file compaction: data is moved towards the beginning of the file before the index entries are updated. The file is then truncated once all existing readers are gone. This allows removing the lock that existed before and enables reading multiple messages at once from the shared files. It also lets us avoid many ets operations and greatly simplifies the code.

This commit still has some issues: reading a single message is currently slow due to the removal of FHC in the client code. This will be resolved by implementing read buffering in a way similar to FHC, but without keeping files open longer than necessary. The dirty recovery code also likely has a number of issues because of the compaction changes.
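To make the compaction idea concrete, here is a minimal Erlang sketch. The helpers `plan_moves/1`, `update_index_entry/2` and `defer_truncate/1` are hypothetical stand-ins, not the actual `rabbit_msg_store` API:

```erlang
%% Minimal sketch of single-file compaction: copy live data from the
%% end of the file into holes near the beginning, update the index,
%% and defer truncation until no reader can hold the old offsets.
compact_file(File) ->
    {ok, Fd} = file:open(File, [read, write, raw, binary]),
    %% Hypothetical: decide which live messages near the end should be
    %% copied into holes near the beginning of the file.
    Moves = plan_moves(Fd),
    lists:foreach(
        fun({MsgId, OldOffset, NewOffset, Size}) ->
            {ok, Data} = file:pread(Fd, OldOffset, Size),
            ok = file:pwrite(Fd, NewOffset, Data),
            %% Only after the data is safely in its new place do we
            %% point the index entry at the new offset, so readers
            %% always find valid data at whichever offset they see.
            ok = update_index_entry(MsgId, NewOffset)
        end, Moves),
    ok = file:close(Fd),
    %% Hypothetical: schedule truncation for when all existing
    %% readers of the old offsets are gone.
    defer_truncate(File).
```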
We no longer use FHC there and don't keep FDs open after reading.
This allows simplifying a bunch of things.
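For illustration, this is roughly what a read without FHC looks like: open, positional read, close, with no FD retained afterwards (a sketch, not the actual client code):

```erlang
%% Read one message from a shared store file without going through
%% file_handle_cache; the FD lives only for the duration of the read.
read_msg(File, Offset, Size) ->
    {ok, Fd} = file:open(File, [read, raw, binary]),
    {ok, Data} = file:pread(Fd, Offset, Size),
    ok = file:close(Fd),
    {ok, Data}.
```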
The cache, used to help keep binary references around for fan-out cases, was introduced in 2009 and removed in 2011. It's no longer relevant...
So in 2009 it was written that combining helped performance. But I doubt we ever get to a scenario today where reducing the number of file_summary table entries matters. There have been plenty of ets table optimisations since, this branch reduces the number of fields in entries anyway, and we don't go over the whole table as often as before. See 30bc61f.
The first is not going to be super useful. The second is not possible because we already have a check on the file_summary table.
Instead of doing complicated +1/-1 accounting, we do an update_counter on an integer value using 2^n values. We always know exactly which state we are in when looking at the ets table. We can also avoid some ets operations as a result, although the performance improvements are minimal.
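Roughly, the idea looks like this; the define names and values below are illustrative, not the actual flags used in the PR:

```erlang
-define(FLYING_WRITE,       1).  %% 2^0: write requested
-define(FLYING_WRITE_DONE,  2).  %% 2^1: write handled by the store
-define(FLYING_REMOVE,      4).  %% 2^2: remove requested
-define(FLYING_REMOVE_DONE, 8).  %% 2^3: remove handled by the store

%% Each event adds a distinct power of two, so the counter value alone
%% identifies exactly which combination of events has happened; with
%% +1/-1 accounting, different histories can collide on the same value.
mark_write_done(Tab, MsgId) ->
    %% Inserts a fresh zero counter if no entry exists yet, then adds
    %% the flag in a single atomic ets operation.
    ets:update_counter(Tab, MsgId, ?FLYING_WRITE_DONE, {MsgId, 0}).
```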
Let's see if this helps performance of single reads.
Also use defines.
This is ready for review/merge into

The main performance improvement comes when reading long queues that have many messages in the store: the queue will now read multiple messages at once, just like in the CQv2 embedded store. This is all done from within the queue process itself as well.

To make this possible, the store's compaction mechanism had to be changed so that it never overwrites data that may be accessed by a queue. Instead of combining two files together (and deleting the old data), the store now compacts a single file, moving data from the end of the file into the holes at the beginning (where messages were removed). Truncation happens later, once we know there are no queues reading from the file (we track when queues access the file to know this). We also avoid hard locks in the process. A sketch of the multi-message read follows below.

I have also reworked the flying message mechanism in the hope that it will allow further optimisations, but I don't think I can do those until the message store no longer uses gen_server2. So this feels like a good time to stop and merge what was done (early, to get a lot of testing) and build further work on top of it in the future.
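As an illustration of the multi-message read, here is a hedged sketch; the `{MsgId, Offset, Size}` shape and function names are assumptions, not the store's actual types. Given the locations of several messages in the same file, sorted by offset, a single pread can cover the whole span:

```erlang
%% Read several messages from one shared-store file in a single pread,
%% then slice each message out of the span. Assumes Msgs is a non-empty
%% list of {MsgId, Offset, Size} sorted by Offset, all in this file.
read_many(File, Msgs = [{_, FirstOffset, _} | _]) ->
    {_, LastOffset, LastSize} = lists:last(Msgs),
    SpanSize = LastOffset + LastSize - FirstOffset,
    {ok, Fd} = file:open(File, [read, raw, binary]),
    {ok, Span} = file:pread(Fd, FirstOffset, SpanSize),
    ok = file:close(Fd),
    %% Cut each message's bytes out of the span we just read.
    [{MsgId, binary:part(Span, Offset - FirstOffset, Size)}
     || {MsgId, Offset, Size} <- Msgs].
```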
The only regression I found is queue deletion time. Deleting a queue with 1M 5kb messages (no other queues in the system) takes 2s on
On
I repeated the test with 15 queues. That's hopefully very much corner-case territory, putting 15M messages in the message store (and deleting them), but I was interested in how these numbers change with scale.
I'm not sure how much time we want to spend on this issue, but let's have a chat about it before we merge (or we can merge and then perhaps improve upon it).
We don't need it and it slows down queue deletion far too much.
With 59259b2, the results are now similar to
I observe similar improvements with messages of 8192 bytes in size, with a backlog of messages across N CQs (CQv2 specifically) and 32 fast consumers.
Great to see that this PR adds fewer lines than it removes.
Now that 3.12.0 has shipped, we can merge this.
Great results for consuming long queues.