Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-34125][CORE][2.4] Make EventLoggingListener.codecMap thread-safe #31194

Closed
wants to merge 3 commits into from

Conversation

cxzl25
Copy link
Contributor

@cxzl25 cxzl25 commented Jan 15, 2021

What changes were proposed in this pull request?

EventLoggingListener.codecMap change mutable.HashMap to ConcurrentHashMap

Why are the changes needed?

2.x version of history server
EventLoggingListener.codecMap is of type mutable.HashMap, which is not thread safe.
This will cause the history server to suddenly get stuck and not work.
The 3.x version was changed to EventLogFileReader.codecMap to ConcurrentHashMap type, so there is no such problem.(SPARK-28869)

Multiple threads call openEventLog, codecMap is updated by multiple threads, mutable.HashMap may fall into an infinite loop during resize, resulting in history server not working.
scala/bug#10436

PID 117049 0x1c939
image

image

Does this PR introduce any user-facing change?

No

How was this patch tested?

exist ut

@cxzl25 cxzl25 changed the title [SPARK-34125][SQL][2.4] Make EventLoggingListener.codecMap thread-safe [SPARK-34125][CORE][2.4] Make EventLoggingListener.codecMap thread-safe Jan 15, 2021
@HeartSaVioR
Copy link
Contributor

ok to test

@HeartSaVioR
Copy link
Contributor

Could we please elaborate more on the impact of the issue in Why are the changes needed? via linking Scala issue as we did in JIRA issue?

@SparkQA
Copy link

SparkQA commented Jan 15, 2021

Test build #134104 has finished for PR 31194 at commit 076e28e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 15, 2021

Test build #134117 has started for PR 31194 at commit 2b94425.

@SparkQA
Copy link

SparkQA commented Jan 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38700/

@SparkQA
Copy link

SparkQA commented Jan 15, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38700/

@srowen
Copy link
Member

srowen commented Jan 15, 2021

Does this not affect master / 3.x?

@HeartSaVioR
Copy link
Contributor

Yes, we changed it already in Spark 3.0, but as a part of commit and we haven't realized it should be fixed in 2.4.x as well.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @cxzl25 and all.

@HeartSaVioR
Copy link
Contributor

retest this, please

@HeartSaVioR
Copy link
Contributor

Just triggered Jenkins/GA. Once one of them passes I'll merge this.

@SparkQA
Copy link

SparkQA commented Jan 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38715/

@SparkQA
Copy link

SparkQA commented Jan 16, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38715/

@SparkQA
Copy link

SparkQA commented Jan 16, 2021

Test build #134132 has finished for PR 31194 at commit 2b94425.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

FYI, the first commit already passed with Jenkins here. And, the second commit is an empty commit to retriever Jenkins.

@SparkQA
Copy link

SparkQA commented Jan 16, 2021

Test build #134140 has finished for PR 31194 at commit 5ed0ee9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor

Test passed. Merging.

HeartSaVioR pushed a commit that referenced this pull request Jan 18, 2021
### What changes were proposed in this pull request?
`EventLoggingListener.codecMap` change `mutable.HashMap` to `ConcurrentHashMap`

### Why are the changes needed?
2.x version of history server
`EventLoggingListener.codecMap` is of type mutable.HashMap, which is not thread safe.
This will cause the history server to suddenly get stuck and not work.
The 3.x version was changed to `EventLogFileReader.codecMap` to `ConcurrentHashMap` type, so there is no such problem.(SPARK-28869)

Multiple threads call `openEventLog`, `codecMap` is updated by multiple threads, `mutable.HashMap` may fall into an infinite loop during `resize`, resulting in history server not working.
scala/bug#10436

PID 117049 0x1c939
![image](https://user-images.githubusercontent.com/3898450/104753904-9239c280-5793-11eb-8a2d-89324ccfb92c.png)

![image](https://user-images.githubusercontent.com/3898450/104753921-9534b300-5793-11eb-99e6-51ac66051d2a.png)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
exist ut

Closes #31194 from cxzl25/SPARK-34125.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
@HeartSaVioR
Copy link
Contributor

Thanks @cxzl25 for the contribution! Merged to branch-2.4.

@dongjoon-hyun
Copy link
Member

Thank you all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants