[SPARK-37029][Shuffle] Modify the assignment logic of dirFetchRequests variables #34304

Closed
wants to merge 1 commit into from


jinhai-cloud
Contributor

What changes were proposed in this pull request?

In the ShuffleBlockFetcherIterator.fetchHostLocalBlocks method, dirFetchRequests is built differently depending on externalShuffleServiceEnabled. However, the BlockManagerId stored in each MapStatus during the shuffle write phase was already created according to externalShuffleServiceEnabled in the BlockManager.initialize method.

So we don't need to check the flag again here.

Why are the changes needed?

Does this PR introduce any user-facing change?

No

How was this patch tested?

@github-actions github-actions bot added the CORE label Oct 17, 2021
@jinhai-cloud jinhai-cloud changed the title Modify the assignment logic of dirFetchRequests variables [SPARK-37029][Shuffle]Modify the assignment logic of dirFetchRequests variables Oct 17, 2021
@AmplabJenkins

Can one of the admins verify this patch?

@HyukjinKwon HyukjinKwon changed the title [SPARK-37029][Shuffle]Modify the assignment logic of dirFetchRequests variables [SPARK-37029][Shuffle] Modify the assignment logic of dirFetchRequests variables Oct 18, 2021
-      hostLocalBlocksWithMissingDirs.keys.map(bmId => (bmId.host, bmId.port, Array(bmId))).toSeq
-    }
+    val dirFetchRequests = hostLocalBlocksWithMissingDirs.keys.map(bmId =>
+      (bmId.host, bmId.port, Array(bmId))).toSeq
Member

@Ngone51 Ngone51 Oct 18, 2021

I think there's a behavior difference after the change: previously, there would be only one call to getHostLocalDirs below. After this change, the number of calls becomes the number of entries in hostLocalBlocksWithMissingDirs, which could introduce more RPC calls.
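
To make the concern concrete, here is a minimal, self-contained sketch of the two ways the request list could be built. It does not use Spark classes: `BmId`, `DirFetchRequestsSketch`, `batched`, and `perBlockManager` are hypothetical stand-ins, and the `batched` variant assumes the previous code grouped all missing block managers into a single request to one host and port, as the comment above implies. Each element of the returned sequence is assumed to correspond to one getHostLocalDirs call.

```scala
// Hypothetical stand-in for Spark's BlockManagerId, for illustration only.
final case class BmId(executorId: String, host: String, port: Int)

object DirFetchRequestsSketch {
  // (host, port, block managers whose local dirs are requested)
  type DirFetchRequest = (String, Int, Array[BmId])

  // Assumed previous behavior: a single batched request covering
  // every block manager with missing dirs.
  def batched(host: String, port: Int, missing: Iterable[BmId]): Seq[DirFetchRequest] =
    Seq((host, port, missing.toArray))

  // Behavior after the change: one request per block manager.
  def perBlockManager(missing: Iterable[BmId]): Seq[DirFetchRequest] =
    missing.map(bmId => (bmId.host, bmId.port, Array(bmId))).toSeq

  def main(args: Array[String]): Unit = {
    val missing = Seq(
      BmId("1", "host-a", 7337),
      BmId("2", "host-a", 7337),
      BmId("3", "host-a", 7337))
    println(s"batched requests: ${batched("host-a", 7337, missing).size}")
    println(s"per-block-manager requests: ${perBlockManager(missing).size}")
  }
}
```

With three block managers on the same host, the batched variant yields one request while the per-block-manager variant yields three, which is the extra RPC traffic the comment points to.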

3 participants