Description
Idea:
We know that the prefixes normally look like this:
[0, 0, 0, 0, 0, 0, 0, 21] length = 8
The keys we check them against mostly look like this:
[0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 0, 12] length = 12
Right now we check the prefix by iterating from the beginning, which means we first have to iterate over several zero bytes. This can be improved by starting from the back, although that only helps when the prefix is different, since a matching prefix still requires comparing every byte. This was what I tried first.
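A minimal sketch of the back-to-front comparison described above. The class and method names are hypothetical and chosen for illustration only; this is not the code from the PR.

```java
final class ReversePrefixCheck {

  /**
   * Returns true if key starts with prefix. The comparison runs from the last
   * prefix byte towards the first, so a differing column family byte (the last
   * byte of the prefix) is found on the first iteration instead of after the
   * leading zero bytes.
   */
  static boolean startsWithFromBack(final byte[] prefix, final byte[] key) {
    if (prefix.length > key.length) {
      return false;
    }
    for (int i = prefix.length - 1; i >= 0; i--) {
      if (prefix[i] != key[i]) {
        return false;
      }
    }
    return true;
  }
}
```

A mismatch exits after one comparison, but a match still touches all eight bytes, which is why this variant only improves the "different prefix" case.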
Later I thought that, in order to improve this for all cases, I could check the column family byte directly. We currently have fewer than 128 column families, so we can just check the last byte of the long (we write it in ByteOrder.BIG_ENDIAN). If this byte is equal, the entry is a valid entry for the CF; otherwise it is not.
Furthermore, sometimes we don't provide any prefix key. In that case we can check whether the length of the prefix is zero, so we don't check the key any further. If the prefix key length is larger than zero, we check the key starting after the CF prefix.
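The combined check can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the names ColumnFamilyPrefixCheck, composeKey and matches are hypothetical, and the sketch assumes that every key starts with the column family written as an 8-byte big-endian long and that the column family number fits into a single byte (fewer than 128 column families).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ColumnFamilyPrefixCheck {

  // Assumption: the column family is written as a long in front of every key.
  static final int CF_PREFIX_LENGTH = Long.BYTES;

  /** Helper for the example: builds a key as [column family long][rest]. */
  static byte[] composeKey(final long columnFamily, final byte[] rest) {
    return ByteBuffer.allocate(CF_PREFIX_LENGTH + rest.length)
        .order(ByteOrder.BIG_ENDIAN)
        .putLong(columnFamily)
        .put(rest)
        .array();
  }

  /**
   * Checks whether key belongs to the column family and, if a prefix key is
   * given, whether the bytes after the column family start with that prefix key.
   */
  static boolean matches(final int columnFamily, final byte[] prefixKey, final byte[] key) {
    if (key.length < CF_PREFIX_LENGTH + prefixKey.length) {
      return false;
    }
    // With big-endian order the least significant byte of the long is the last
    // of the eight bytes, so one comparison decides the column family.
    if (key[CF_PREFIX_LENGTH - 1] != (byte) columnFamily) {
      return false;
    }
    // No prefix key provided: the column family match is sufficient.
    if (prefixKey.length == 0) {
      return true;
    }
    // Otherwise compare the prefix key against the bytes after the CF prefix.
    for (int i = 0; i < prefixKey.length; i++) {
      if (key[CF_PREFIX_LENGTH + i] != prefixKey[i]) {
        return false;
      }
    }
    return true;
  }
}
```

Checking only one byte is valid only as long as the upper seven bytes of the column family long are always zero; if the number of column families ever grew past what a single byte can hold, the check would need to cover more bytes.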
The JMH benchmark results were interesting since they show a much lower error rate (variance) than the other test runs.
I want to run a benchmark that maxes out performance to see whether there is any difference.
Related issues
related #12241
Definition of Done
Not all items need to be done depending on the issue and the pull request.
Code changes:
Add the backport label (backport stable/1.3) to the PR; in case that fails you need to create backports manually.
Testing:
Documentation:
Other teams:
If the change impacts another team an issue has been created for this team, explaining what they need to do to support this change.
Please refer to our review guidelines.