
[SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index #27716

Closed
wants to merge 13 commits into apache:master from gengliangwang:liveUIStore

Conversation

gengliangwang
Member

@gengliangwang gengliangwang commented Feb 26, 2020

What changes were proposed in this pull request?

Spark uses the class InMemoryStore as the KV storage for the live UI and the history server (by default, when no LevelDB file path is provided).
In InMemoryStore, all the task data of one application is stored in a hashmap whose key is the task ID and whose value is the task data. This is fine for getting or deleting an entry by a provided task ID.
However, the Spark stage UI always shows all the task data of one stage, and the current implementation looks up all the values in the hashmap, so the time complexity is O(numOfTasks).
Also, when there are too many stages (> spark.ui.retainedStages), Spark linearly looks up all the task data of the stages to be deleted as well.

This can be very bad for a large application with many stages and tasks. We can improve it by allowing the natural key of an entity to have a real parent index, so that on each lookup with a parent key provided, Spark can first look up all the natural keys (in our case, the task IDs) under that parent, and then find the data for those natural keys in the hashmap.
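
For illustration, here is a minimal sketch of the parent-index idea (the class and member names below are illustrative only and do not match the actual InMemoryStore internals):

```
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a KV map plus a parent-to-children index, so that all
// children of one parent can be read or deleted without scanning every entity.
class ParentIndexedStoreSketch<K, V> {
  // natural key -> entity data (what the existing hashmap already provides)
  private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();
  // parent key (e.g. stage ID) -> natural keys (e.g. task IDs) of its children
  private final ConcurrentHashMap<Object, Set<K>> parentToChildren = new ConcurrentHashMap<>();

  void write(K naturalKey, Object parentKey, V value) {
    data.put(naturalKey, value);
    parentToChildren
      .computeIfAbsent(parentKey, p -> ConcurrentHashMap.newKeySet())
      .add(naturalKey);
  }

  // O(childrenOfParent) instead of O(allEntities): only touch the keys of one parent.
  List<V> readAll(Object parentKey) {
    Set<K> keys = parentToChildren.getOrDefault(parentKey, Collections.emptySet());
    List<V> result = new ArrayList<>(keys.size());
    for (K k : keys) {
      V v = data.get(k);
      if (v != null) {
        result.add(v);
      }
    }
    return result;
  }
}
```

With such an index, reading or deleting the tasks of one stage only touches that stage's task IDs instead of every task in the application.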

Why are the changes needed?

The in-memory KV store becomes really slow for large applications. We can improve it with a new index; the performance can be 10, 100, or even 1000 times faster.
This can also make the Spark driver more stable for large applications.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests.
Also, I ran a benchmark with the following code (newTaskData and attemptId are assumed to come from the surrounding test code):

  // Needed for asJavaCollection below.
  import scala.collection.JavaConverters._

  val store = new InMemoryStore()
  val numberOfTasksPerStage = 10000
  // Write 1000 stages x 10000 tasks each into the store.
  (0 until 1000).map { sId =>
    (0 until numberOfTasksPerStage).map { taskId =>
      val task = newTaskData(sId * numberOfTasksPerStage + taskId, "SUCCESS", sId)
      store.write(task)
    }
  }
  val appStatusStore = new AppStatusStore(store)
  var start = System.nanoTime()
  // Compute task metric quantiles for a single stage.
  appStatusStore.taskSummary(2, attemptId, Array(0, 0.25, 0.5, 0.75, 1))
  println("task summary run time: " + ((System.nanoTime() - start) / 1000000))
  val stageIds = Seq(1, 11, 66, 88)
  val stageKeys = stageIds.map(Array(_, attemptId))
  start = System.nanoTime()
  // Remove all task data belonging to the four stages above.
  store.removeAllByIndexValues(classOf[TaskDataWrapper], TaskIndexNames.STAGE,
    stageKeys.asJavaCollection)
  println("clean up tasks run time: " + ((System.nanoTime() - start) / 1000000))

Task summary before the changes: 98642ms
Task summary after the changes: 120ms

Task clean up before the changes: 4900ms
Task clean up after the changes: 4ms

It's 800x faster after the changes in the micro-benchmark.

@SparkQA

SparkQA commented Feb 26, 2020

Test build #118993 has finished for PR 27716 at commit 4f93ffc.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang gengliangwang removed the request for review from zsxwing February 27, 2020 00:04
@SparkQA

SparkQA commented Feb 27, 2020

Test build #118995 has finished for PR 27716 at commit b0bb448.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #118997 has finished for PR 27716 at commit d50c801.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Feb 27, 2020

retest this please

@SparkQA

SparkQA commented Feb 27, 2020

Test build #118994 has finished for PR 27716 at commit c6c2c82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #118998 has finished for PR 27716 at commit 9cb44c3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #118996 has finished for PR 27716 at commit 8969a4d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #118999 has finished for PR 27716 at commit da463e9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #119005 has finished for PR 27716 at commit da463e9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

retest this please.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #119015 has finished for PR 27716 at commit da463e9.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 27, 2020

Test build #119018 has finished for PR 27716 at commit 753a14a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 28, 2020

Test build #119099 has finished for PR 27716 at commit 091fb7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Comment on lines 275 to 279
for (NaturalKeys v : parentToChildrenMap.values()) {
  if (v.remove(asKey(key))) {
    break;
  }
}
Member

When a parent key in parentToChildrenMap points to empty NaturalKeys, we can also remove it?

Member Author

Yes, nothing will change if the NaturalKeys v doesn't contain the key.

Member

Oh, I meant after v.remove(asKey(key)), if v is empty, can we remove the (parent key, empty NaturalKeys) from parentToChildrenMap?

Member Author

Well, parentToChildrenMap is a concurrent map, and checking emptiness costs time.
The method here only deletes one entry; I think we can keep it simple and leave it this way.
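
For reference, a hedged sketch of what this suggestion could look like (illustrative only, reusing asKey and parentToChildrenMap from the snippet quoted above, and assuming parentToChildrenMap is a ConcurrentHashMap whose NaturalKeys values behave like concurrent maps; this is not the merged code):

```
// Hypothetical variant of deleteParentIndex that also prunes empty NaturalKeys.
private void deleteParentIndexAndPrune(Object key) {
  Comparable<Object> naturalKey = asKey(key);
  for (Comparable<Object> parentKey : parentToChildrenMap.keySet()) {
    // computeIfPresent runs atomically per entry; returning null drops the
    // (parentKey, empty NaturalKeys) mapping once its last child is removed.
    parentToChildrenMap.computeIfPresent(parentKey, (p, children) -> {
      children.remove(naturalKey);
      return children.isEmpty() ? null : children;
    });
  }
}
```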

Comparable<Object> parentKey = asKey(parent);
if (!naturalParentIndexName.isEmpty() &&
    naturalParentIndexName.equals(ti.getParentIndexName(index))) {
  // If there is a parent index for the natural index and the parent of`index` happens to be
Member

@viirya viirya Feb 29, 2020

of`index` -> of `index`

    .collect(Collectors.toList());
Comparable<Object> parentKey = asKey(parent);
if (!naturalParentIndexName.isEmpty() &&
    naturalParentIndexName.equals(ti.getParentIndexName(index))) {
Member

Is it possible that naturalParentIndexName doesn't equal ti.getParentIndexName(index)? Isn't String index = KVIndex.NATURAL_INDEX_NAME?


@SparkQA

SparkQA commented Feb 29, 2020

Test build #119116 has finished for PR 27716 at commit e02acc0.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

retest this please.

@SparkQA

SparkQA commented Feb 29, 2020

Test build #119119 has finished for PR 27716 at commit 491e9eb.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

retest this please.

@SparkQA

SparkQA commented Mar 1, 2020

Test build #119130 has finished for PR 27716 at commit 491e9eb.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 1, 2020

Test build #119134 has finished for PR 27716 at commit c0d3755.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

retest this please.

@SparkQA

SparkQA commented Mar 1, 2020

Test build #119140 has finished for PR 27716 at commit c0d3755.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

Linearly scanning all the task data to look up only one stage looks like a performance issue to me, and we should fix it in 3.0 as well.

Thanks, merging to master/3.0!

@cloud-fan cloud-fan closed this in 6b64143 Mar 2, 2020
cloud-fan pushed a commit that referenced this pull request Mar 2, 2020

Closes #27716 from gengliangwang/liveUIStore.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 6b64143)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@gengliangwang
Member Author

@cloud-fan @viirya Thanks for the review!

gengliangwang pushed a commit that referenced this pull request Mar 4, 2020
…p when removing key from CountingRemoveIfForEach

### What changes were proposed in this pull request?

This patch addresses a spot missed in SPARK-30964 (#27716). SPARK-30964 added a secondary index that defines the parent-children relationship and makes it possible to operate on all children of a given parent faster.

While SPARK-30964 handled the addition and deletion of the secondary index in InstanceList properly, it missed handling deletion of the secondary index in CountingRemoveIfForEach, resulting in leaked index entries. This patch adds the deletion of the secondary index in CountingRemoveIfForEach, as sketched below.
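
A hedged sketch of the kind of fix described above; the shape of CountingRemoveIfForEach here is an assumption and the names are illustrative, not the exact Spark code:

```
import java.util.concurrent.ConcurrentMap;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Hypothetical sketch: a counting remove-if visitor that, when it deletes an
// entry, also removes that entry's natural key from the parent-to-children
// index so the secondary index does not leak.
class CountingRemoveIfForEachSketch<T> implements BiConsumer<Comparable<Object>, T> {
  private final ConcurrentMap<Comparable<Object>, T> data;
  private final Predicate<? super T> filter;
  private int count = 0;

  CountingRemoveIfForEachSketch(ConcurrentMap<Comparable<Object>, T> data,
      Predicate<? super T> filter) {
    this.data = data;
    this.filter = filter;
  }

  @Override
  public void accept(Comparable<Object> key, T value) {
    if (filter.test(value)) {
      data.remove(key);
      deleteParentIndex(key);  // the missed spot: also drop the secondary-index entry
      count++;
    }
  }

  // Placeholder for the parent-index cleanup performed elsewhere in InstanceList.
  private void deleteParentIndex(Comparable<Object> key) {
    // e.g. remove `key` from the parent-to-children map, as the PR above does on delete
  }

  int count() {
    return count;
  }
}
```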

### Why are the changes needed?

Described above.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

N/A, as the relevant field and class are marked private and cannot be checked from a higher level. I'm not sure we want to adjust the scope to add a test.

Closes #27765 from HeartSaVioR/SPARK-31014.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
HeartSaVioR added a commit to HeartSaVioR/spark that referenced this pull request Mar 6, 2020
…p when removing key from CountingRemoveIfForEach


Closes apache#27765 from HeartSaVioR/SPARK-31014.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
dongjoon-hyun pushed a commit that referenced this pull request Mar 7, 2020
…renMap when removing key from CountingRemoveIfForEach


Closes #27825 from HeartSaVioR/SPARK-31014-branch-3.0.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020

Closes apache#27716 from gengliangwang/liveUIStore.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…p when removing key from CountingRemoveIfForEach


Closes apache#27765 from HeartSaVioR/SPARK-31014.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
gengliangwang pushed a commit that referenced this pull request Apr 29, 2020
…with InMemoryStore

### What changes were proposed in this pull request?
#27716 introduced a parent index for InMemoryStore. When the method "deleteParentIndex(Object key)" in InMemoryStore.java is called and the key is not contained in "NaturalKeys v", a java.lang.NullPointerException is thrown. This patch fixes the issue by updating the if condition.
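
A hedged sketch of the failure mode and the kind of fix, based on the deleteParentIndex loop quoted earlier in this conversation (illustrative, not the exact diff): if NaturalKeys maps keys to Boolean, ConcurrentHashMap.remove(k) returns null when k is absent, and using that result directly as an if condition auto-unboxes null and throws.

```
// Illustrative null check; the actual fix updates the if condition in deleteParentIndex.
for (NaturalKeys v : parentToChildrenMap.values()) {
  if (v.remove(asKey(key)) != null) {  // avoids unboxing a null Boolean when the key is absent
    break;
  }
}
```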

### Why are the changes needed?
Fixed a minor bug.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Added a unit test for deleteParentIndex.

Closes #28378 from baohe-zhang/SPARK-31584.

Authored-by: Baohe Zhang <baohe.zhang@verizonmedia.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
gengliangwang pushed a commit that referenced this pull request Apr 29, 2020
…with InMemoryStore


Closes #28378 from baohe-zhang/SPARK-31584.

Authored-by: Baohe Zhang <baohe.zhang@verizonmedia.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
(cherry picked from commit 3808014)
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>