-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvement for upper/lower bound #5059
Comments
This is a good suggestion. Thanks. |
I plan to try the idea this month. Will update. |
@yiwu-arbug Another thing I noticed is we're actually doing two comparisons for each key that comes from an SST: here ( rocksdb/table/block_based_table_reader.cc Lines 2505 to 2511 in 8a1ecd1
Lines 454 to 457 in ca89ac2
|
I did a quick prototype and number of key comparison is reduced by 1/3 for long range scan, when I apply both of (a) avoid recheck upper-bound when it is greater than block's largest key, and (b) avoid checking upper-bound at the point @ajkr points out. I also find that rocksdb is not updating perf_context.user_key_comparison_count correctly for range scan. I'll work on a PR and share more concrete benchmark result soon. |
Patch to fix perf_context.user_key_comparison_count: #5098 |
Fixing duplicated per-key upper bound check: #5101 |
Whether this generates net saving depends on how likely the one in table reader can actually invalidate the iterator. For example, in your primary key point-lookup operation, it's more likely that the upper bound check will filter out most SST files from putting into the iterator heap. Any iterator put into the iterator heap will generate more than one comparisons. I'm sure that in the case you mentioned, the long scan, you'll see smaller number of comparisons. But how about put it together and run your whole system? Also, number of comparisons is not right counter to measure gain. CPU is a better one. When you have this optimization applied to your whole system, will you see CPU saving? |
The actual improvement and db_bench result: #5111 |
@siying Appreciate your comment! For background, the optimization targets for queries like "select count(*)" for relatively small tables in our system (so that the table can still be cached in memory, but the scan is long enough). Sorry I didn't provide enough test result, as it take me a while to get them. But they are in #5111. db_bench do show CPU savings for particular workload. Sure for whole system the net saving would be small, but it is not insignificant, for the the change is relatively simple. |
How about a compromise: we check the upper bound in block based table after Seek(), and not after Next()? |
Do you mind explain a little bit more? |
@yiwu-arbug I mean we do per key check inside block based table only in Seek(), and remove it for Next(). In this way relatively longer range scan will be fast, and the cases where most levels can be filtered out by per key filtering can also be fast. |
The proposed change doesn't touch the Seek() flow at all. Its capability to filter out levels remain the same. |
Blocking on fixing issue with format_version>=3 (see #5101 (comment)). Fix of the issue is blocking by #5139, which by itself is a major issue. edited: #5139 is non-issue. |
Retrying PR5101: #5142 |
Current behavior when scan a range with lower & upper bound is set:
Fetch a key from memtable/sst/block, compare the key with lower/upper bound, return invalid if the key exceed bound. So there is one comparing for each key.
Improvement suggestion:
When a sst is totally covered by lower & upper bound, all keys returned from this sst don't need to compare with bound.
When a block is totally covered by lower & upper bound, all keys returned from this block don't need to compare with bound.
This can improve scan speed.
@yiwu-arbug
The text was updated successfully, but these errors were encountered: