Skip to content

[MINOR][CORE] Replace Scala forkjoin package to Java bundled one#24113

Closed
HeartSaVioR wants to merge 2 commits intoapache:masterfrom
HeartSaVioR:MINOR-replace-scala-forkjoin-package-to-java
Closed

[MINOR][CORE] Replace Scala forkjoin package to Java bundled one#24113
HeartSaVioR wants to merge 2 commits intoapache:masterfrom
HeartSaVioR:MINOR-replace-scala-forkjoin-package-to-java

Conversation

@HeartSaVioR
Copy link
Contributor

What changes were proposed in this pull request?

The patch proposes to replace deprecated Scala forkjoin package to Java bundled one.

More details on follow commit in Scala side: scala/scala@fa8012d

As of Scala 2.12 they're just alias of Java bundled one, and being deprecated.

How was this patch tested?

Jenkins build.

@srowen
Copy link
Member

srowen commented Mar 16, 2019

That seems OK; I recall some 2.11 vs 2.12 problems in trying to update this code, but I don't recall a reason this wouldn't work. The Java impl has been there from Java 8, or even 7.

@SparkQA
Copy link

SparkQA commented Mar 16, 2019

Test build #4628 has finished for PR 24113 at commit d3c3c74.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 16, 2019

Test build #103563 has finished for PR 24113 at commit d3c3c74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -183,15 +182,15 @@ private[spark] object ThreadUtils {
/**
* Construct a new Scala ForkJoinPool with a specified max parallelism and name prefix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment, too?

import scala.collection.mutable.ArrayBuffer
import scala.collection.parallel.ForkJoinTaskSupport
import scala.concurrent.forkjoin.ForkJoinPool
import scala.reflect.ClassTag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot use ThreadUtils.newForkJoinPool in this class instead of ForkJoinPool directly?

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run find . -type f | grep -e "\.scala$" | xargs grep "ForkJoinPool" to check that no other class uses scala.concurrent.forkjoin.ForkJoinPool directly.

LGTM except for minor comments.

@SparkQA
Copy link

SparkQA commented Mar 17, 2019

Test build #4635 has started for PR 24113 at commit d3c3c74.

@SparkQA
Copy link

SparkQA commented Mar 17, 2019

Test build #103590 has finished for PR 24113 at commit 89e175e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* Construct a new ForkJoinPool with a specified max parallelism and name prefix.
*/
def newForkJoinPool(prefix: String, maxThreadNumber: Int): SForkJoinPool = {
def newForkJoinPool(prefix: String, maxThreadNumber: Int): ForkJoinPool = {
Copy link
Member

@dongjoon-hyun dongjoon-hyun Mar 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this okay for Scala-2.11, too? We still need to support Scala-2.11, don't we?

According to SPARK-13398, this was a hack for Scala-2.11 because we can't use Java's ForkJoinPool directly in Scala 2.11 since it uses a ExecutionContext which reports system parallelism.

Hi, @holdenk . Could you give us some opinion on this? The old issue is resolved now? Are we safe to revert the old SPARK-13398?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed we are still supporting Scala-2.11. It wouldn't even compile for Scala-2.11 because of the parameter type of ForkJoinTaskSupport. Thanks for pointing it out!

@dongjoon-hyun
Copy link
Member

BTW, @HeartSaVioR . We had better have a JIRA ID for this because this looks like a revert of SPARK-13398.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wait for @holdenk 's opinion for a while.

@srowen
Copy link
Member

srowen commented Mar 18, 2019

Yeah I definitely recall some issue here, but I think it was the issue you cite and simply the fact that it is deprecated in 2.12. The issue you cite and described at #11310 I think is OK here as this explicitly sets the parallelism on the ForkJoinPool.

@HeartSaVioR
Copy link
Contributor Author

I just realized that the code would not compile with -Pscala-2.11 profile, since ForkJoinTaskSupport in Scala 2.11 requires Scala version of ForkJoinPool instead of Java bundled one - so we cannot do this unless we set lower bar of Scala to scala-2.12.

Btw, once we want to both support Scala 2.12 and Scala 2.13, current code will break since they look to simply remove the alias in Scala 2.13 (They seem to remove ForkJoinTaskSupport in scala 2.13, too). Maybe broader fix should be needed at that time then.

One interesting thing is, some claims performance regression on changing ForkJoinPool from Scala one to Java bundled one (which is done Scala 2.12), and Akka team decided to keep Scala 2.11 version of ForkJoinPool in their repository. I'm not sure I can follow the discussions and changes related to this (they stated that the perf difference came from using busy-wait), but the implementation of ForkJoinPool on Scala 2.11 and Java bundled one sounds to be different, and perf. regression might happen silently between Scala 2.11 vs Scala 2.12.

http://www.scala-archive.org/ForkJoinPool-ExecutionContext-Implicits-global-performance-problem-for-Scala-2-12-td4647851.html
scala/bug#10083
https://akka.io/blog/news/2017/04/13/akka-2.5.0-released
https://github.com/akka/akka/blob/master/akka-actor/src/main/java/akka/dispatch/forkjoin/ForkJoinPool.java

Anyway let me close the PR for now. Thanks all for spending time to review, and thanks @dongjoon-hyun to let me realize about supporting Scala versions.

@HeartSaVioR
Copy link
Contributor Author

HeartSaVioR commented Mar 18, 2019

Btw, it would be ideal to have test matrix on Scala 2.11 vs Scala 2.12 in Jenkins which would find issue which works with Scala 2.12 and fails with 2.11 easier. (if we feel too expensive for this, passing build profile on build request would be possible alternative.)

@srowen
Copy link
Member

srowen commented Mar 18, 2019

Ah OK I thought you were saying this worked on 2.11 at the outset. Yeah this is why then. We are dropping 2.11 support shortly and can try this then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants