Skip to content

[SPARK-31344][CORE] Polish implementation of barrier() and allGather()#28117

Closed
Ngone51 wants to merge 5 commits intoapache:masterfrom
Ngone51:refactor_barrier
Closed

[SPARK-31344][CORE] Polish implementation of barrier() and allGather()#28117
Ngone51 wants to merge 5 commits intoapache:masterfrom
Ngone51:refactor_barrier

Conversation

@Ngone51
Copy link
Member

@Ngone51 Ngone51 commented Apr 4, 2020

What changes were proposed in this pull request?

  1. Combine BarrierRequestToSync and AllGatherRequestToSync into RequestToSync, which is distinguished by RequestMethod type.

  2. Remove unnecessary Json serialization/deserialization

  3. Clean up some codes to make runBarrier() and BarrierCoordinator more general

  4. Remove unused imports.

Why are the changes needed?

To make codes simpler for better maintain in the future.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

This is pure code refactor, so should be covered by existed tests.

@Ngone51
Copy link
Member Author

Ngone51 commented Apr 4, 2020

This PR tried to address comment #27395 (review) , but please note that we can not simply build barrier() on top of allGather(""), because we need a way do distinguish these two different calls to prevent following illegal case:

rdd.barrier().mapPartitions { iter =>
   val context = BarrierTaskContext.get()
   val partitionId = context.partitionId
   if (partitionId == 0) {
     context.barrier()
   } else {
     context.allGather(partitionId.toString)
   }
  iter
 }

ping @sarthfrey @jiangxb1987 @mengxr Please take a look, thanks!

@SparkQA
Copy link

SparkQA commented Apr 4, 2020

Test build #120795 has finished for PR 28117 at commit 2f18910.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 4, 2020

Test build #120796 has finished for PR 28117 at commit f95965e.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 4, 2020

Test build #120797 has finished for PR 28117 at commit 617b80a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

private[spark] object RequestMethod extends Enumeration {
val BARRIER, ALL_GATHER = Value
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove redundant empty line.

Copy link
Contributor

@sarthfrey sarthfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning this up @Ngone51, writing back the messages individually rather than parsing JSON + keeping a single case class seems to have removed a lot of bloat :)

()
}
def barrier(): Unit = runBarrier("", RequestMethod.BARRIER)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra line


// An Array of allGather messages for barrier tasks that have made a blocking runBarrier() call
private val allGatherMessages: ArrayBuffer[String] = new Array[String](numTasks).to[ArrayBuffer]
// Messages from each barrier task that have made a blocking runBarrier() call. And it will be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: And it -> The messages

@SparkQA
Copy link

SparkQA commented Apr 7, 2020

Test build #120905 has finished for PR 28117 at commit e8d774b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Ngone51
Copy link
Member Author

Ngone51 commented Apr 9, 2020

Ping @mengxr @jiangxb1987

@jiangxb1987
Copy link
Contributor

LGTM, I'll wait for another couple days before merge this PR.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Apr 15, 2020

Test build #121307 has finished for PR 28117 at commit e8d774b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Apr 15, 2020

Test build #121332 has finished for PR 28117 at commit e8d774b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2020

Test build #121372 has finished for PR 28117 at commit e8d774b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Apr 17, 2020

Test build #121382 has finished for PR 28117 at commit e8d774b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Apr 17, 2020

Test build #121384 has finished for PR 28117 at commit e8d774b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

Thanks, merged to master!

@Ngone51
Copy link
Member Author

Ngone51 commented Apr 17, 2020

thanks all!

Ngone51 added a commit to Ngone51/spark that referenced this pull request Apr 21, 2020
### What changes were proposed in this pull request?

1. Combine  `BarrierRequestToSync` and `AllGatherRequestToSync` into `RequestToSync`, which is distinguished by `RequestMethod` type.

2. Remove unnecessary Json serialization/deserialization

3. Clean up some codes to make runBarrier() and `BarrierCoordinator` more general

4. Remove unused imports.

### Why are the changes needed?

To make codes simpler for better maintain in the future.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This is pure code refactor, so should be covered by existed tests.

Closes apache#28117 from Ngone51/refactor_barrier.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants