[SPARK-34573][SQL] Avoid global locking in SQLConf object for sqlConfEntries map #31689

sparkyengine · 2021-03-01T04:39:02Z

What changes were proposed in this pull request?

In the SQLConf object, the sqlConfEntries map is globally synchronized (it is a Java Collections.synchronizedMap): any operation, including a get, will need to acquire the lock.

An example of this is calling the DataType.sameType method, which triggers a check on SQLConf.get.caseSensitiveAnalysis. So every time we compare two data types with sameType, we hit the lock.

To avoid having multiple tasks contend on this lock, a better approach is to use a map that does not lock on read, such as a ConcurrentHashMap. On write, it locks only part of the map, so two writes contend only when they target the same key (more precisely, the same hash bin).
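The difference described above can be observed directly. The following is a minimal sketch, not code from this patch: the class name MapLockDemo and the helper readCompletesWhileMonitorHeld are made up for illustration. It relies on the documented fact that a Collections.synchronizedMap wrapper uses itself as its mutex, so holding its monitor stalls a get from another thread, while a ConcurrentHashMap read does not take that monitor.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Spark.
final class MapLockDemo {

    /**
     * Holds the map's own monitor on the calling thread and reports whether
     * a get() issued from a second thread finishes within the timeout.
     */
    static boolean readCompletesWhileMonitorHeld(Map<String, String> map, long timeoutMs) {
        map.put("spark.sql.caseSensitive", "false");
        CountDownLatch readDone = new CountDownLatch(1);
        Thread reader = new Thread(() -> {
            map.get("spark.sql.caseSensitive");
            readDone.countDown();
        });
        reader.setDaemon(true);
        try {
            boolean completed;
            synchronized (map) {
                // While we hold the monitor, a synchronizedMap read must wait.
                reader.start();
                completed = readDone.await(timeoutMs, TimeUnit.MILLISECONDS);
            }
            // Monitor released: the reader can always finish now.
            reader.join();
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reader", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> synced = Collections.synchronizedMap(new HashMap<String, String>());
        Map<String, String> concurrent = new ConcurrentHashMap<String, String>();
        System.out.println("synchronizedMap read completed: "
                + readCompletesWhileMonitorHeld(synced, 500));
        System.out.println("ConcurrentHashMap read completed: "
                + readCompletesWhileMonitorHeld(concurrent, 500));
    }
}
```

With the synchronized wrapper the read should time out while the monitor is held; with the ConcurrentHashMap it should complete right away. The 500 ms timeout is an arbitrary choice for the demo.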

Why are the changes needed?

Any task that directly or indirectly queries the SQLConf.sqlConfEntries map must acquire a global lock on that map. Even something as simple as calling DataType.sameType(...) contends on the global lock of the Collections.synchronizedMap.

Does this PR introduce any user-facing change?

No

How was this patch tested?

No functionality change. Existing unit tests run normally.

… lock Note that the merge function is only called if the old value is not null, so it will always fail the require check (as intended): V newValue = (oldValue == null) ? value : remappingFunction.apply(oldValue, value);
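The merge behaviour noted above can be sketched in isolation. This is an illustrative example only, assuming a registry that must reject duplicate keys; the ConfRegistry class, its String values, and the exception message are hypothetical and are not the code in SQLConf. Because ConcurrentHashMap.merge invokes the remapping function only when the key is already mapped, a function that always throws turns merge into an atomic "insert or fail" without any external synchronized block.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for illustration; not Spark's SQLConf.
final class ConfRegistry {
    private final ConcurrentHashMap<String, String> entries =
            new ConcurrentHashMap<String, String>();

    /** Registers key -> entry, rejecting a duplicate key atomically. */
    void register(String key, String entry) {
        entries.merge(key, entry, (existing, incoming) -> {
            // Reached only when the key already has a value, so reaching it
            // at all means a duplicate registration. If it throws, the
            // existing mapping is left unchanged.
            throw new IllegalArgumentException(
                    "Duplicate entry: " + key + " has already been registered");
        });
    }

    String lookup(String key) {
        return entries.get(key);
    }

    public static void main(String[] args) {
        ConfRegistry registry = new ConfRegistry();
        registry.register("spark.sql.demo.key", "first");
        try {
            registry.register("spark.sql.demo.key", "second");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("value: " + registry.lookup("spark.sql.demo.key"));
    }
}
```

The first registration stores the value without calling the function; the second one fails and leaves the original value in place.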

maropu · 2021-03-02T00:21:30Z

ok to test

maropu · 2021-03-02T00:22:34Z

Could this change eliminate the lock wait you mentioned in the JIRA? If so, could you describe it in more detail in the PR description?

sparkyengine · 2021-03-02T01:27:28Z

Could this change eliminate the lock wait you mentioned in the JIRA? If so, could you describe it in more detail in the PR description?

Added more descriptive PR text.

SparkQA · 2021-03-02T01:51:51Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40196/

SparkQA · 2021-03-02T02:25:46Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40196/

maropu · 2021-03-02T02:57:07Z

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

-  private[sql] val sqlConfEntries = java.util.Collections.synchronizedMap(
-    new java.util.HashMap[String, ConfigEntry[_]]())
+  private[sql] val sqlConfEntries =
+    new ConcurrentHashMap[String, ConfigEntry[_]]()


Seems a reasonable fix cc: @srowen @cloud-fan @viirya
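Both types in the diff implement java.util.Map, so call sites compile unchanged. One general behavioural difference worth keeping in mind with this kind of swap (it was not raised in this thread, and whether any SQLConf caller depends on it is not established here) is null handling: a synchronized HashMap accepts null keys, while ConcurrentHashMap throws NullPointerException. A minimal sketch, with a made-up class name:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of Spark.
final class NullHandlingDemo {

    /** Returns true if looking up a null key throws NullPointerException. */
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> synced =
                Collections.synchronizedMap(new HashMap<String, String>());
        Map<String, String> concurrent = new ConcurrentHashMap<String, String>();
        System.out.println("synchronizedMap rejects null key: " + rejectsNullKey(synced));
        System.out.println("ConcurrentHashMap rejects null key: " + rejectsNullKey(concurrent));
    }
}
```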

srowen

OK pending tests

viirya

looks reasonable. pending test.

sparkyengine · 2021-03-02T05:36:19Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40196/

@srowen what does this mean? in jenkins the failure is:

Setting status of 4ffa2576782e410cef4d5cb3d8865362679ff11c to FAILURE with url https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40196/ and message: 'Build finished. '
FileNotFoundException means that the credentials Jenkins is using is probably wrong. Or the user account does not have write access to the repo.
org.kohsuke.github.GHFileNotFoundException: https://api.github.com/repos/apache/spark/statuses/4ffa2576782e410cef4d5cb3d8865362679ff11c {"message":"Not Found","documentation_url":"https://docs.github.com/rest/reference/repos#create-a-commit-status"}

SparkQA · 2021-03-02T05:38:44Z

Test build #135617 has finished for PR 31689 at commit 4ffa257.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2021-03-02T06:33:35Z

@srowen what does this mean? in jenkins the failure is:

The failure is not related to this PR, so it's okay just to ignore it.

maropu · 2021-03-02T06:37:08Z

Thanks! Merged to master.

HyukjinKwon · 2021-03-02T06:38:20Z

Yeah seems good

cloud-fan · 2021-03-02T08:59:05Z

late LGTM

avoid global locking in SQLConf for sqlConfEntries map

1928504

github-actions bot added the SQL label Mar 1, 2021

removing synchronized as it is not needed

5dc5268

sparkyengine changed the title ~~[SPARK-34573][SQL][CORE] avoid global locking in SQLConf object for sqlConfEntries map~~ [WIP][SPARK-34573][SQL][CORE] avoid global locking in SQLConf object for sqlConfEntries map Mar 1, 2021

sparkyengine changed the title ~~[WIP][SPARK-34573][SQL][CORE] avoid global locking in SQLConf object for sqlConfEntries map~~ [WIP][SPARK-34573][SQL] avoid global locking in SQLConf object for sqlConfEntries map Mar 1, 2021

sparkyengine changed the title ~~[WIP][SPARK-34573][SQL] avoid global locking in SQLConf object for sqlConfEntries map~~ [SPARK-34573][SQL] avoid global locking in SQLConf object for sqlConfEntries map Mar 1, 2021

maropu changed the title ~~[SPARK-34573][SQL] avoid global locking in SQLConf object for sqlConfEntries map~~ [SPARK-34573][SQL] Avoid global locking in SQLConf object for sqlConfEntries map Mar 2, 2021

maropu reviewed Mar 2, 2021

srowen approved these changes Mar 2, 2021

viirya reviewed Mar 2, 2021

maropu approved these changes Mar 2, 2021

maropu closed this in b13a4b8 Mar 2, 2021

7 participants
