Conversation

JackCaoG (Collaborator)

…yout is pin

In PyTorch/XLA we need to pin the layouts of all cross-core communication ops, since the program is built separately for every core. However, there is no easy way to pin the layout for all-gather on TPU. In this PR I changed all-gather to use all-reduce when the user actually wants to pin the layout. I also changed the layout pinning to default to True; it was previously set to False because all-gather's layout could not be pinned.
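For illustration, here is a minimal sketch (not this PR's actual lowering) of the general technique of emulating an all-gather with an all-reduce, assuming torch_xla's `xm.all_reduce`, `xm.get_ordinal()`, and `xm.xrt_world_size()` APIs:

```python
import torch
import torch_xla.core.xla_model as xm


def all_gather_via_all_reduce(value, dim=0):
    # Number of participating cores and this core's position in the group.
    world_size = xm.xrt_world_size()
    ordinal = xm.get_ordinal()
    # Build a zero buffer whose gather dimension is world_size times larger,
    # then copy this core's contribution into its own slice.
    sizes = list(value.size())
    sizes[dim] *= world_size
    padded = torch.zeros(sizes, dtype=value.dtype, device=value.device)
    padded.narrow(dim, ordinal * value.size(dim), value.size(dim)).copy_(value)
    # Summing the zero-padded buffers across cores reproduces the
    # concatenation an all-gather would return, and all-reduce's layout
    # can be pinned.
    return xm.all_reduce(xm.REDUCE_SUM, padded)
```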

@hjm-aws I am guessing you still prefer to use all_gather directly, so I set pin_layout to False for all xla_process_group usages. Let me know if that works for you.
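As a rough usage sketch, assuming the flag described above is exposed as a `pin_layout` keyword argument on `xm.all_gather` (the exact signature is an assumption here):

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
t = torch.arange(4, dtype=torch.float32, device=device)

# Default after this change: the layout is pinned.
gathered = xm.all_gather(t, dim=0)

# Opt out of layout pinning (what the xla_process_group path does),
# keeping the native all-gather lowering.
gathered_unpinned = xm.all_gather(t, dim=0, pin_layout=False)
```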

FYI @hjm-aws @ronghanghu

@JackCaoG JackCaoG requested review from miladm and hjm-aws May 14, 2022 00:45
@JackCaoG JackCaoG requested a review from hjm-aws May 18, 2022 18:42
JackCaoG (Collaborator, Author)

@hjm-aws If this looks OK to you I will merge it today.

@JackCaoG JackCaoG merged commit 7f96fbf into master May 18, 2022
@JackCaoG JackCaoG deleted the all_gather_use_all_reduce branch May 18, 2022 20:37