[SPARK-16941]Use concurrentHashMap instead of scala Map in SparkSQLOperationManager. #14534

SaintBacchus · 2016-08-08T06:15:42Z

What changes were proposed in this pull request?

ThriftServer has a thread-safety problem in SparkSQLOperationManager.
Add a SynchronizedMap trait to its maps to avoid this problem.

Details in SPARK-16941

How was this patch tested?

NA

SparkQA · 2016-08-08T06:48:07Z

Test build #63346 has finished for PR 14534 at commit 4af58bc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-08-08T07:49:56Z

.../src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala

@@ -39,8 +39,10 @@ private[thriftserver] class SparkSQLOperationManager()
  val handleToOperation = ReflectionUtils
    .getSuperField[JMap[OperationHandle, Operation]](this, "handleToOperation")

-  val sessionToActivePool = Map[SessionHandle, String]()
-  val sessionToContexts = Map[SessionHandle, SQLContext]()
+  val sessionToActivePool = new mutable.HashMap[SessionHandle, String]()


I think the best practice is to use Java's ConcurrentHashMap. I seem to recall Scala's trait is deprecated or discouraged. Is this amount of synchronization sufficient?

markhamstra · 2016-08-08

Correct; SynchronizedMap has been deprecated since Scala 2.11.0 with this comment in the API docs: "Synchronization via traits is deprecated as it is inherently unreliable. Consider java.util.concurrent.ConcurrentHashMap as an alternative."

The title of this PR must be updated to match what is actually being done after the switch to use ConcurrentHashMap since we don't want the misleading "Add SynchronizedMap trait" to persist in the commit history.
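
For readers following the question of whether the synchronization is sufficient, here is a minimal sketch of what ConcurrentHashMap does and does not guarantee. It is not the code from this PR: the `SessionPools` object, its method names, and the use of `String` keys in place of Hive's `SessionHandle` are all assumptions made for illustration.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the manager's map: String keys replace SessionHandle.
object SessionPools {
  private val sessionToActivePool = new ConcurrentHashMap[String, String]()

  // A single put is thread-safe on its own.
  def setPool(session: String, pool: String): Unit = {
    sessionToActivePool.put(session, pool)
  }

  // putIfAbsent is one atomic step. It returns null when the key was absent
  // and the new value was stored, or the existing value otherwise.
  def registerIfAbsent(session: String, pool: String): Boolean =
    sessionToActivePool.putIfAbsent(session, pool) == null

  // A single remove is thread-safe on its own.
  def close(session: String): Unit = {
    sessionToActivePool.remove(session)
  }

  def isOpen(session: String): Boolean =
    sessionToActivePool.containsKey(session)
}
```

Individual calls are safe without external locking, but a sequence such as `containsKey` followed by `put` is still a race between threads; an atomic method like `putIfAbsent` covers that case. Note also that ConcurrentHashMap rejects null keys and null values.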

SparkQA · 2016-08-08T08:55:37Z

Test build #63353 has finished for PR 14534 at commit a333269.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-08-08T11:18:38Z

Test build #63354 has finished for PR 14534 at commit 2591b50.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

rxin · 2016-08-08T22:37:53Z

.../src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala

-
-import scala.collection.mutable.Map
+import java.util.concurrent.ConcurrentHashMap
+import java.util.Map


let's rename this -- otherwise it is very confusing whether we are using a scala Map or a java Map
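
One way to address this is an import rename, sketched below. The `JMap` alias matches the name already visible in the first diff of this thread, but whether the follow-up commit used exactly this form is an assumption; the `MapNaming` object is hypothetical.

```scala
import java.util.{Map => JMap}
import java.util.concurrent.ConcurrentHashMap
import scala.collection.mutable

// Hypothetical holder object, for illustration only.
object MapNaming {
  // JMap can only mean java.util.Map, so the declared type is unambiguous.
  val javaMap: JMap[String, String] = new ConcurrentHashMap[String, String]()

  // The Scala collection is referenced through its package prefix.
  val scalaMap: mutable.Map[String, String] = mutable.Map.empty[String, String]
}
```

With the alias in place, an unqualified `Map` in the file keeps its usual Scala meaning.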

SparkQA · 2016-08-09T01:40:13Z

Test build #63400 has finished for PR 14534 at commit 592d817.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SaintBacchus · 2016-08-09T03:00:49Z

cc/ @srowen Is this OK?

srowen · 2016-08-09T08:47:24Z

...r/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala

@@ -206,15 +206,16 @@ private[hive] class SparkExecuteStatementOperation(
      statementId,
      parentSession.getUsername)
    sqlContext.sparkContext.setJobGroup(statementId, statement)
-    sessionToActivePool.get(parentSession.getSessionHandle).foreach { pool =>
+    val pool = sessionToActivePool.get(parentSession.getSessionHandle)
+    if(null != pool) {


if (pool != null)
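
The null check is needed because `get` on a Java map returns null for a missing key, whereas `get` on a Scala Map returns an Option. The sketch below shows the explicit check next to an Option wrapper that keeps the earlier `foreach` style. The `PoolLookup` object, the `String` keys, and the `"default"` fallback are assumptions, not code from the PR.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in: String keys replace SessionHandle.
object PoolLookup {
  val sessionToActivePool = new ConcurrentHashMap[String, String]()

  // Explicit null check, in the spacing the reviewer asked for.
  def poolOrDefault(session: String): String = {
    val pool = sessionToActivePool.get(session)
    if (pool != null) pool else "default"
  }

  // Option(null) is None, so a missing key becomes None here.
  def poolOption(session: String): Option[String] =
    Option(sessionToActivePool.get(session))
}
```

Either form reads the map once, which matters here: checking `containsKey` and then calling `get` would be two separate operations that another thread could interleave with.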

SparkQA · 2016-08-09T09:43:38Z

Test build #63433 has finished for PR 14534 at commit 0a436a0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SaintBacchus · 2016-08-10T03:22:05Z

Any other comments?

srowen · 2016-08-10T09:18:12Z

.../src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala

-    val sqlContext = sessionToContexts(parentSession.getSessionHandle)
+    val sqlContext = sessionToContexts.get(parentSession.getSessionHandle)
+    require(sqlContext != null, s"Session handle: ${parentSession.getSessionHandle} has not been" +
+      s" initialed or had already closed.")


One last tiny nit: initialed -> initialized
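
A small sketch of the behavior change in this diff: applying a Scala Map to a missing key throws NoSuchElementException, while `get` on a Java map returns null, so the `require` call is what turns a missing session into a clear error. The `ContextLookup` object and the `String` stand-ins for `SessionHandle` and `SQLContext` are assumptions for illustration.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in: String replaces both SessionHandle and SQLContext.
object ContextLookup {
  val sessionToContexts = new ConcurrentHashMap[String, String]()

  def contextFor(session: String): String = {
    val ctx = sessionToContexts.get(session)
    // require throws IllegalArgumentException when the condition is false.
    require(ctx != null, s"Session handle: $session has not been" +
      " initialized or had already closed.")
    ctx
  }
}
```

Without the `require`, the null would propagate and most likely fail later with a less informative NullPointerException.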

SparkQA · 2016-08-10T13:17:21Z

Test build #63531 has finished for PR 14534 at commit 4f27261.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-08-11T10:28:41Z

Merged to master

Add SynchronizedMap trait with Map in SparkSQLOperationManager to avoid concurrency problem.

4af58bc

srowen reviewed Aug 8, 2016

Use ConcurrentHashMap instead of Scala Map

a333269

Fix scala style.

2591b50

rxin reviewed Aug 8, 2016

Rename Java map

592d817

SaintBacchus changed the title ~~[SPARK-16941]Add SynchronizedMap trait with Map in SparkSQLOperationManager.~~ [SPARK-16941]Use concurrentHashMap instead of scala Map in SparkSQLOperationManager. Aug 9, 2016

srowen reviewed Aug 9, 2016

Commit the comment.

0a436a0

srowen reviewed Aug 10, 2016

Fix typo

4f27261

asfgit closed this in a45fefd Aug 11, 2016