Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4 #22240

Closed
wants to merge 8 commits into from

Conversation

mengxr
Copy link
Contributor

@mengxr mengxr commented Aug 27, 2018

What changes were proposed in this pull request?

I made one pass over barrier APIs added to Spark 2.4 and updates some scopes and docs. I will update Python docs once Scala doc was reviewed.

One major issue is that BarrierTaskContext implements TaskContextImpl that exposes some public methods. And internally there were several direct references to TaskContextImpl methods instead of TaskContext. This PR moved some methods from TaskContextImpl to TaskContext, remaining package private, and used delegate methods to avoid inheriting TaskContextImp and exposing unnecessary APIs.

TODOs:

override val stageId: Int,
override val stageAttemptNumber: Int,
override val partitionId: Int,
override val taskAttemptId: Long,
override val attemptNumber: Int,
override val taskMemoryManager: TaskMemoryManager,
private[spark] override val taskMemoryManager: TaskMemoryManager,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not exposed by TaskContext.

@@ -68,7 +74,7 @@ class BarrierTaskContext(
*
* CAUTION! In a barrier stage, each task must have the same number of barrier() calls, in all
* possible code branches. Otherwise, you may get the job hanging or a SparkException after
* timeout. Some examples of misuses listed below:
* timeout. Some examples of '''misuses''' listed below:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use bold font to make sure users don't misread

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: listed -> are listed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just saw it, will include it if Jenkins fails:)

*/
@Experimental
@Since("2.4.0")
class BarrierTaskContext private[spark] (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the constructor package private to force users get it from #get().

@SparkQA
Copy link

SparkQA commented Aug 27, 2018

Test build #95276 has finished for PR 22240 at commit 19e380a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -28,4 +28,4 @@ import org.apache.spark.annotation.{Experimental, Since}
*/
@Experimental
@Since("2.4.0")
class BarrierTaskInfo(val address: String)
class BarrierTaskInfo private[spark] (val address: String)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hide the constructor since this is not to be constructed by user

*/
@Experimental
@Since("2.4.0")
class RDDBarrier[T: ClassTag] private[spark] (rdd: RDD[T]) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also hide the constructor here

@SparkQA
Copy link

SparkQA commented Aug 27, 2018

Test build #95280 has finished for PR 22240 at commit c2624ed.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr mengxr changed the title [WIP] [SPARK-25248] [CORE] Audit barrier APIs for 2.4 [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4 Aug 28, 2018
@SparkQA
Copy link

SparkQA commented Aug 28, 2018

Test build #95329 has finished for PR 22240 at commit 7e2975b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

Copy link
Contributor

@jiangxb1987 jiangxb1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SparkQA
Copy link

SparkQA commented Aug 28, 2018

Test build #95361 has finished for PR 22240 at commit 7e2975b.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 28, 2018

Test build #95379 has finished for PR 22240 at commit 365e7b8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor Author

mengxr commented Aug 29, 2018

test this please

@SparkQA
Copy link

SparkQA commented Aug 29, 2018

Test build #95398 has finished for PR 22240 at commit 365e7b8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Aug 29, 2018

retest this please

asfgit pushed a commit that referenced this pull request Aug 29, 2018
## What changes were proposed in this pull request?

I made one pass over the Python APIs for barrier mode and updated them to match the Scala doc in #22240 . Major changes:

* export the public classes
* expand the docs
* add doc for BarrierTaskInfo.addresss

cc: jiangxb1987

Closes #22261 from mengxr/SPARK-25248.1.

Authored-by: Xiangrui Meng <meng@databricks.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@SparkQA
Copy link

SparkQA commented Aug 29, 2018

Test build #95418 has finished for PR 22240 at commit 365e7b8.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -82,31 +82,22 @@ private[spark] abstract class Task[T](
SparkEnv.get.blockManager.registerTask(taskAttemptId)
// TODO SPARK-24874 Allow create BarrierTaskContext based on partitions, instead of whether
// the stage is barrier.
context = if (isBarrier) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context is still used in the following statements.

object BarrierTaskContext {
/**
* :: Experimental ::
* Return the currently active BarrierTaskContext. This can be called inside of user functions to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Return -> Returns?

fjh100456 pushed a commit to fjh100456/spark that referenced this pull request Aug 31, 2018
## What changes were proposed in this pull request?

I made one pass over the Python APIs for barrier mode and updated them to match the Scala doc in apache#22240 . Major changes:

* export the public classes
* expand the docs
* add doc for BarrierTaskInfo.addresss

cc: jiangxb1987

Closes apache#22261 from mengxr/SPARK-25248.1.

Authored-by: Xiangrui Meng <meng@databricks.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@SparkQA
Copy link

SparkQA commented Sep 3, 2018

Test build #95639 has finished for PR 22240 at commit b6a3c5b.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 3, 2018

Test build #95640 has finished for PR 22240 at commit ae4a8e6.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 3, 2018

Test build #95641 has finished for PR 22240 at commit 47ebd08.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 4, 2018

Test build #95642 has finished for PR 22240 at commit c61eec3.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 4, 2018

Test build #95648 has finished for PR 22240 at commit 9b6a47b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Sep 4, 2018

Test build #95661 has finished for PR 22240 at commit 9b6a47b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

LGTM

@mengxr
Copy link
Contributor Author

mengxr commented Sep 4, 2018

Merged into master. Thanks for review!

@asfgit asfgit closed this in 061bb01 Sep 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants