[SPARK-6521][Core]executors in the same node read local shuffle file #5178

Closed

viper-kun
Contributor

Previously, an executor read shuffle files written by other executors on the same node over the network. This PR makes executors on the same node read those shuffle files directly from local disk when sort-based shuffle is used, which reduces network transfer.
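A minimal sketch of the idea described above, not the code in this patch: shuffle blocks are split by whether the executor that wrote them runs on the reader's host. `ExecutorLocation`, `ShuffleBlock`, and `ShuffleReadPlan` are hypothetical stand-ins introduced for illustration, not Spark classes.

```scala
// Hypothetical stand-ins; these are not Spark's BlockManagerId or ShuffleBlockId.
final case class ExecutorLocation(executorId: String, host: String)
final case class ShuffleBlock(shuffleId: Int, mapId: Int, reduceId: Int)

object ShuffleReadPlan {
  /**
   * Splits blocks into (blocks on the reader's host, blocks on other hosts).
   * The first group could be read from local disk; the second must be fetched
   * over the network.
   */
  def split(
      reader: ExecutorLocation,
      blocksByLocation: Seq[(ExecutorLocation, Seq[ShuffleBlock])])
    : (Seq[ShuffleBlock], Seq[ShuffleBlock]) = {
    val (sameHost, otherHosts) =
      blocksByLocation.partition { case (location, _) => location.host == reader.host }
    (sameHost.flatMap(_._2), otherHosts.flatMap(_._2))
  }
}
```

Grouping by host rather than by executor ID is what lets blocks written by a different executor on the same machine skip the network path.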

@andrewor14
Contributor

ok to test

blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, 0))
def getDataFile(shuffleId: Int,
mapId: Int,
blockManagerId: BlockManagerId = blockManager.blockManagerId): File = {
Contributor

Hey @viper-kun, the style here and in other places should be:

def getDataFile(
    shuffleId: Int,
    mapId: Int,
    blockManagerId: BlockManagerId = ...): File = {
  ...
}

@SparkQA

SparkQA commented Mar 25, 2015

Test build #29167 has finished for PR 5178 at commit 5b766ca.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster

blockManagerInfo.filter(info => (info._1 != blockManagerId && info._1.host == blockManagerId.host))
.map { case(blockManagerId, info) =>
(blockManagerId, info.localDirsPath)
}.toMap
Contributor

style

blockManagerInfo
  .filter { case (id, _) => (id != blockManagerId && id.host == blockManagerId.host) }
  .mapValues { info => info.localDirsPath }
  .toMap
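A self-contained sketch of what this lookup is meant to return, assuming simplified types: `PeerId` and `PeerInfo` are hypothetical stand-ins for `BlockManagerId` and the block manager info entry, and `Seq[String]` replaces `Array[String]` so results compare by value. It uses `collect` to filter and project in one pass, which is a variation on the `filter` plus `mapValues` form suggested above, not the same code.

```scala
// Hypothetical stand-ins for Spark's BlockManagerId and per-executor info.
final case class PeerId(executorId: String, host: String)
final case class PeerInfo(localDirsPath: Seq[String])

object SameHostPeers {
  /** Local directories of every other block manager on the same host as `self`. */
  def localDirsOfSameHostPeers(
      self: PeerId,
      all: Map[PeerId, PeerInfo]): Map[PeerId, Seq[String]] = {
    all.collect {
      // Skip the caller itself and anything on a different host.
      case (id, info) if id != self && id.host == self.host => id -> info.localDirsPath
    }
  }
}
```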

@andrewor14
Contributor

Hi @viper-kun thanks for working on this. However, it seems that there are quite a few style violations. For more detail please see https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide or https://github.com/databricks/scala-style-guide. Once you fix those I will do a closer review.

@SparkQA

SparkQA commented Mar 26, 2015

Test build #29217 timed out for PR 5178 at commit 0eaf1af after a configured wait of 120m.

@SparkQA

SparkQA commented Mar 26, 2015

Test build #29218 timed out for PR 5178 at commit 6162aef after a configured wait of 120m.

@SparkQA

SparkQA commented Mar 27, 2015

Test build #29274 timed out for PR 5178 at commit bb94736 after a configured wait of 120m.

@viper-kun
Contributor Author

Hi @andrewor14, please retest this. The test builds timed out.

@WangTaoTheTonic
Contributor

Jenkins, test this please.

@SparkQA

SparkQA commented Mar 27, 2015

Test build #29296 timed out for PR 5178 at commit bb94736 after a configured wait of 120m.

@scwf
Contributor

scwf commented Mar 28, 2015

Jenkins, test this please.

@SparkQA

SparkQA commented Mar 28, 2015

Test build #29346 has finished for PR 5178 at commit 9ddf84c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster

@SparkQA

SparkQA commented Mar 29, 2015

Test build #29361 has finished for PR 5178 at commit 383a21f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster

@SparkQA

SparkQA commented Mar 29, 2015

Test build #29363 has finished for PR 5178 at commit 6ee8df2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster

@SparkQA

SparkQA commented Mar 30, 2015

Test build #29385 has started for PR 5178 at commit 7ae6e46.

@SparkQA

SparkQA commented Mar 30, 2015

Test build #29381 has finished for PR 5178 at commit c429b66.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster
  • This patch does not change any dependencies.

@scwf
Contributor

scwf commented Mar 30, 2015

Jenkins, test this please.

@SparkQA

SparkQA commented Mar 30, 2015

Test build #29391 has started for PR 5178 at commit 6c5c1d4.

@viper-kun
Contributor Author

@andrewor14 Please review it. Thanks.

private def getLocalDirsPath(
blockManagerId: BlockManagerId): Map[BlockManagerId, Array[String]] = {
blockManagerInfo
.filter { case(id, _) => (id != blockManagerId && id.host == blockManagerId.host)}
Member

Nit: Unnecessary parentheses.

@maropu
Member

maropu commented Apr 3, 2015

One question: is it common for multiple executors to share a single host in YARN mode?

@scwf
Contributor

scwf commented Apr 5, 2015

@maropu, yes, I think it is a common case in YARN mode. We often request more executors than there are NodeManagers, which means more than one executor runs on a single machine.
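As a rough illustration of that point (a sketch with hypothetical names, not anything from the patch): when the executors are spread over the NodeManagers, the pigeonhole principle says some host runs at least the ceiling of executors divided by NodeManagers, so requesting more executors than NodeManagers guarantees co-located executors.

```scala
object ExecutorPacking {
  /** Lower bound on the number of executors running on the busiest host. */
  def minExecutorsOnBusiestHost(numExecutors: Int, numNodeManagers: Int): Int = {
    require(numExecutors >= 0 && numNodeManagers > 0, "need at least one NodeManager")
    // Integer ceiling division: ceil(numExecutors / numNodeManagers).
    (numExecutors + numNodeManagers - 1) / numNodeManagers
  }
}
```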

@maropu
Member

maropu commented Apr 5, 2015

Understood.

curRequestSize = 0
}
if (shuffleMgrName.toLowerCase == "hash" || externalShuffleServiceEnabled) {
for ((address, blockInfos) <- blocksByAddress) {
Member

We should check whether SortShuffleManager is in use, because this patch only supports it. The current logic will fail once new shuffle managers are implemented.
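One way to read this suggestion, as a sketch only: enable the local-read path when the configured manager is positively identified as sort-based, instead of disabling it only for `"hash"`. The object and method names are hypothetical, and the accepted values (`"sort"` and the fully qualified `SortShuffleManager` class name) are assumptions about how the shuffle manager setting is spelled.

```scala
import java.util.Locale

object LocalShuffleReadCheck {
  // Assumed spellings of the sort-based shuffle manager setting.
  private val SortShortName = "sort"
  private val SortClassName = "org.apache.spark.shuffle.sort.SortShuffleManager"

  /**
   * True only when the shuffle manager is known to be sort-based and the
   * external shuffle service is off. Unknown managers fall back to remote reads.
   */
  def canReadLocalShuffleFiles(
      shuffleMgrName: String,
      externalShuffleServiceEnabled: Boolean): Boolean = {
    val isSortBased =
      shuffleMgrName.toLowerCase(Locale.ROOT) == SortShortName ||
        shuffleMgrName == SortClassName
    isSortBased && !externalShuffleServiceEnabled
  }
}
```

With an allow-list like this, a shuffle manager added later takes the existing remote-fetch path by default rather than the local-read path it may not support.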

@maropu
Member

maropu commented Apr 17, 2015

@viper-kun What's the status of this patch? If you don't plan further updates, I'd like to take it over and polish it.

@viper-kun
Contributor Author

@maropu I will update it.

@andrewor14
Contributor

@viper-kun @maropu any updates? Should we take over? I would recommend that we close this patch since it's mostly gone stale. We can always reopen an updated version later.

@AmplabJenkins

Can one of the admins verify this patch?

@andrewor14
Contributor

Let's close this PR and reopen it later if necessary.
