FLINK-11050 add lowerBound and upperBound for optimizing RocksDBMapState's entries #7226
Hi @Myracle, thanks for the PR. I think we should either support the new API for all What do you think about this @StefanRRichter? Best, Fabian |
Thank you for your reply. The lowerBound and upperBound are used by RocksDB's interface to avoid wasting time on deleted values. The situation is not the same when storing state in heap. Do you still think that we should support the new API for all |
I think this suggestion is problematic for the following reasons:
Overall that leads me to the conclusion that we cannot rush to add this optimization but need a bit more careful thinking, e.g. introducing a subclass OrderedMapState (openly or hidden and cast where optimization is required). Even in that case we need to be careful when addressing the problem of different orders ( |
Thank you @StefanRRichter for your reply. The points you raised are all valid and I learned a lot from them. But considering the large performance gain from this optimization, I think it will help many people. My code is already used on our company's online platform and has been stable for a long time. We only use IntervalJoin to process our data. Following your suggestions, and to keep the code simple, I think it is semantically better to add filter(lowerBound, upperBound) to MapState rather than entries(lowerBound, upperBound), although the two do the same thing. We also add isIntervalJoinSeekOptimization to the config to enable the optimization. This optimization is not exposed to users. Currently only RocksDB's filter is supported, because RocksDB's storage differs from the other backends and only RocksDB supports large state. RocksDB also provides lowerBound and upperBound interfaces that let users optimize seeks. Because the implementation is specific to RocksDB, the optimization is only enabled through the config parameter. Regarding comparability, we will add a note in the comments to warn developers about key types: anyone who wants to use this function must guarantee that the keys compare correctly. In my case, in the intervalJoinOperator, the key is a timestamp, which is comparable in byte-lexicographical order. As for upperBound, RocksDB doesn't support setIterateUpperBound in ReadOptions until rocksdbjni version 5.9.2, and Flink's rocksdbjni version is 5.7.5. Those are my thoughts. The code has been updated accordingly. Thank you. |
1cd7460 to d966332
@Myracle I don't doubt the usefulness or stability of the approach; the concern is that it does not generalize well beyond your use case of integer values. Even for integer values, the semantics of the upper and lower bounds would fail for negative integers, whose byte sequences sort lexicographically after those of positive values. Please also note that we could not yet update the RocksDB version because of a performance regression in the merge operator, but this should be fixed soon. |
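The ordering problem with negative integers can be reproduced without RocksDB. The sketch below assumes keys are serialized as big-endian two's-complement longs and compared as unsigned bytes, which is how a bytewise comparator behaves. The class and method names are hypothetical, and the sign-bit flip is a common order-preserving encoding shown for illustration only; it is not something proposed in this thread.

```java
import java.nio.ByteBuffer;

final class ByteOrderSketch {
    // Plain big-endian two's-complement encoding (ByteBuffer's default byte order).
    static byte[] encodeBigEndian(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Hypothetical alternative: flip the sign bit so that unsigned byte order
    // agrees with signed numeric order.
    static byte[] encodeOrderPreserving(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v ^ Long.MIN_VALUE).array();
    }

    // Unsigned, byte-by-byte lexicographic comparison of two keys.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // With the plain encoding, -1 sorts after 1 in byte order.
        System.out.println(compareBytes(encodeBigEndian(-1L), encodeBigEndian(1L)) > 0);
        // With the sign bit flipped, -1 sorts before 1, matching numeric order.
        System.out.println(compareBytes(encodeOrderPreserving(-1L), encodeOrderPreserving(1L)) < 0);
    }
}
```

Non-negative timestamps are unaffected, because their sign bit is always zero. That is why the interval join case works while the general integer case does not.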
Hello, any more thoughts on implementing ordering and filtering of MapState? This would really boost the inner join performance. |
What is the purpose of the change
This pull request optimizes the seek of RocksDBMapState's entries by assigning lowerBound and upperBound.
Brief change log
- Use entries(lowerBound, upperBound) instead of entries() in IntervalJoin.java when reading the buffer's values.
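As a rough illustration of why a bounded scan helps, the sketch below uses a `TreeMap` keyed by timestamp as a stand-in for a sorted key space. It is not the RocksDB-backed implementation from this pull request. The class name, method names, and the `visited` counter are hypothetical, and the stand-in does not model RocksDB's deleted entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BoundedScanSketch {
    // Unbounded scan: visits every entry and filters by timestamp afterwards.
    static List<String> fullScan(TreeMap<Long, String> buffer, long lower, long upper, int[] visited) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : buffer.entrySet()) {
            visited[0]++;
            if (e.getKey() >= lower && e.getKey() <= upper) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    // Bounded scan: seeks to the lower bound and stops at the upper bound (both inclusive).
    static List<String> boundedScan(TreeMap<Long, String> buffer, long lower, long upper, int[] visited) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : buffer.subMap(lower, true, upper, true).entrySet()) {
            visited[0]++;
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> buffer = new TreeMap<>();
        for (long ts = 0; ts < 1000; ts++) {
            buffer.put(ts, "event-" + ts);
        }
        int[] fullVisited = {0};
        int[] boundedVisited = {0};
        List<String> a = fullScan(buffer, 400, 409, fullVisited);
        List<String> b = boundedScan(buffer, 400, 409, boundedVisited);
        System.out.println("same result: " + a.equals(b));
        System.out.println("entries visited: " + fullVisited[0] + " vs " + boundedVisited[0]);
    }
}
```

Both methods return the same values. The difference is how many entries are touched, and that gap grows with the size of the buffered state.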
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation