[SPARK-31013][Core][WebUI] InMemoryStore: improve removeAllByIndexValues over natural key index#27763
gengliangwang wants to merge 2 commits into apache:master
|
Test build #119187 has finished for PR 27763 at commit
|
|
retest this please. |
common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
|
Btw, off-topic, but while we are here: is it intentional or an oversight that CountingRemoveIfForEach only removes the data and doesn't touch parentToChildrenMap? If it's just an oversight, I could submit a PR to fix it. |
You are right. It is a missing part of #27716 . |
|
Thanks for confirming, @gengliangwang ! |
|
Test build #119191 has finished for PR 27763 at commit
|
|
retest this please |
|
Test build #119188 has finished for PR 27763 at commit
|
|
Just submitted a PR #27765. It may conflict with this one - I'll rebase if necessary once this PR gets merged. |
|
Test build #119195 has finished for PR 27763 at commit
|
|
retest this please. |
|
Test build #119205 has finished for PR 27763 at commit
|
|
retest this please. |
|
Test build #119210 has finished for PR 27763 at commit
|
|
thanks, merging to master! |
…ues over natural key index

### What changes were proposed in this pull request?

The method `removeAllByIndexValues` in KVStore is to delete all the objects which have certain values in the given index.
However, in the current implementation of `InMemoryStore`, when the given index is the natural key index, there is no special handling for it and a linear search over all the task data is performed.
We can improve it by deleting the natural keys directly from the internal hashmap.

### Why are the changes needed?

Better performance if the given index for `removeAllByIndexValues` is the natural key index in `InMemoryStore`

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Enhance the existing test.

Closes apache#27763 from gengliangwang/useNaturalIndex.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
The method `removeAllByIndexValues` in KVStore is to delete all the objects which have certain values in the given index.
However, in the current implementation of `InMemoryStore`, when the given index is the natural key index, there is no special handling for it and a linear search over all the task data is performed.
We can improve it by deleting the natural keys directly from the internal hashmap.
Why are the changes needed?
Better performance if the given index for `removeAllByIndexValues` is the natural key index in `InMemoryStore`.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Enhance the existing test.
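
For readers who want to see the idea in isolation, the following is a minimal sketch of the optimization described above, not the actual `InMemoryStore` code. The class `NaturalKeyStoreSketch`, the constant `NATURAL_INDEX`, and the methods `addIndex`, `write`, and `size` are hypothetical names invented for illustration; the sketch also drops the class parameter and keeps all objects in a single map. It only shows the contrast between the two paths: one hash removal per requested value when the index is the natural key, versus a scan over every stored object for any other index.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical, simplified stand-in for an in-memory KV store (not Spark's InMemoryStore).
class NaturalKeyStoreSketch<K, T> {
  // Assumed name for the natural key index, chosen for this sketch only.
  static final String NATURAL_INDEX = "naturalKey";

  // Objects keyed by their natural key, mirroring the "internal hashmap" in the description.
  private final Map<K, T> data = new ConcurrentHashMap<>();
  private final Function<T, K> naturalKeyOf;
  // Accessors for secondary (non-natural) indexes, keyed by index name.
  private final Map<String, Function<T, Object>> indexes = new ConcurrentHashMap<>();

  NaturalKeyStoreSketch(Function<T, K> naturalKeyOf) {
    this.naturalKeyOf = naturalKeyOf;
  }

  void addIndex(String name, Function<T, Object> accessor) {
    indexes.put(name, accessor);
  }

  void write(T value) {
    data.put(naturalKeyOf.apply(value), value);
  }

  int size() {
    return data.size();
  }

  // Returns true if at least one object was removed.
  boolean removeAllByIndexValues(String index, Collection<?> values) {
    if (NATURAL_INDEX.equals(index)) {
      // Fast path: each value is itself a map key, so remove it directly.
      boolean removed = false;
      for (Object key : values) {
        if (data.remove(key) != null) {
          removed = true;
        }
      }
      return removed;
    }
    Function<T, Object> accessor = indexes.get(index);
    if (accessor == null) {
      throw new IllegalArgumentException("Unknown index: " + index);
    }
    // Slow path: scan every stored object and compare its index value.
    return data.values().removeIf(obj -> values.contains(accessor.apply(obj)));
  }

  public static void main(String[] args) {
    // Each element is {taskId, stageId}; taskId plays the role of the natural key.
    NaturalKeyStoreSketch<Integer, int[]> store = new NaturalKeyStoreSketch<>(t -> t[0]);
    store.addIndex("stage", t -> t[1]);
    store.write(new int[] {1, 10});
    store.write(new int[] {2, 10});
    store.write(new int[] {3, 20});

    store.removeAllByIndexValues(NATURAL_INDEX, Arrays.asList(1)); // direct removal by key
    store.removeAllByIndexValues("stage", Arrays.asList(10));      // linear scan
    System.out.println(store.size());
  }
}
```

With the fast path, the cost depends on the number of values passed in rather than on the number of stored objects, which is the performance benefit the description refers to.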