-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-6521][Core]executors in the same node read local shuffle file #5178
Conversation
ok to test |
blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, 0)) | ||
def getDataFile(shuffleId: Int, | ||
mapId: Int, | ||
blockManagerId: BlockManagerId = blockManager.blockManagerId): File = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @viper-kun the style here and other places should be:
def getDataFile(
shuffleId: Int,
mapId: Int,
blockManagerId: BlockManagerId = ...): File = {
...
}
Test build #29167 has finished for PR 5178 at commit
|
blockManagerInfo.filter(info => (info._1 != blockManagerId && info._1.host == blockManagerId.host)) | ||
.map { case(blockManagerId, info) => | ||
(blockManagerId, info.localDirsPath) | ||
}.toMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style
blockManagerInfo
.filter { case (id, _) => (id != blockManagerId && id.host == blockManagerId.host) }
.mapValues { info => info.localDirsPath }
.toMap
Hi @viper-kun thanks for working on this. However, it seems that there are quite a few style violations. For more detail please see https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide or https://github.com/databricks/scala-style-guide. Once you fix those I will do a closer review. |
Test build #29217 timed out for PR 5178 at commit |
Test build #29218 timed out for PR 5178 at commit |
Test build #29274 timed out for PR 5178 at commit |
Hi @andrewor14. pls retest it, test build time out. |
Jenkins, test this please. |
Test build #29296 timed out for PR 5178 at commit |
Jenkins, test this please. |
Test build #29346 has finished for PR 5178 at commit
|
Test build #29361 has finished for PR 5178 at commit
|
Test build #29363 has finished for PR 5178 at commit
|
Test build #29385 has started for PR 5178 at commit |
Test build #29381 has finished for PR 5178 at commit
|
Jenkins, test this please. |
Test build #29391 has started for PR 5178 at commit |
@andrewor14 Pls review it. Thanks |
private def getLocalDirsPath( | ||
blockManagerId: BlockManagerId): Map[BlockManagerId, Array[String]] = { | ||
blockManagerInfo | ||
.filter { case(id, _) => (id != blockManagerId && id.host == blockManagerId.host)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Unnecessary parentheses.
One question; are there many cases for executors to share a single host in the Yarn mode? |
@maropu , yeah i think it is a common case for yarn mode. We often specify more executors than nodemanager, that means there are more than one executor on one machine. |
Understood. |
curRequestSize = 0 | ||
} | ||
if (shuffleMgrName.toLowerCase == "hash" || externalShuffleServiceEnabled) { | ||
for ((address, blockInfos) <- blocksByAddress) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check if SortShuffleManager is used because this patch only supports it. This logic fails when new shuffle managers will be implemented.
@viper-kun What's the status of this patch? If you don't make further updates, I'd like to brush up this patch. |
@maropu I will update it. |
@viper-kun @maropu any updates? Should we take over? I would recommend that we close this patch since it's mostly gone stale. We can always reopen an updated version later. |
Can one of the admins verify this patch? |
Let's close this PR and reopen it later if necessary. |
In the past, executor read other executor's shuffle file in the same node by net. This pr make that executors in the same node read local shuffle file In sort-based Shuffle. It will reduce net transport.