fmassa commented Sep 6, 2025

Previously, we hashed only the current node's OpSpec string. That string includes the supported input/output placements, but it doesn't account for the node's input nodes, which might have different output shardings. This PR fixes that, and also adds an assert in `optimize_sharding` to validate that the created links are consistent.
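
To illustrate the collision described above, here is a minimal sketch. It is not the code from this PR: `Node`, `op_spec`, `output_shardings`, `old_key`, and `new_key` are hypothetical names chosen for the example, and the placement strings are made up. It only shows why a key built from the node's own spec string can merge two nodes whose inputs are sharded differently, and how folding the inputs' output shardings into the hash separates them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a graph node; not the project's real class.
    name: str
    op_spec: str  # stand-in for the node's OpSpec string
    inputs: list = field(default_factory=list)
    output_shardings: tuple = ()  # candidate output placements, as strings


def old_key(node: Node) -> str:
    # Hashes only the node's own spec string (the behavior described as "previously").
    return hashlib.sha256(node.op_spec.encode()).hexdigest()


def new_key(node: Node) -> str:
    # Also folds in each input node's output shardings, in input order.
    h = hashlib.sha256(node.op_spec.encode())
    for inp in node.inputs:
        h.update(b"|")
        h.update(repr(inp.output_shardings).encode())
    return h.hexdigest()


# Two producers that differ only in the shardings they can output.
p1 = Node("p1", "producer", output_shardings=("S(0)", "R"))
p2 = Node("p2", "producer", output_shardings=("S(1)", "R"))

# Two consumers with identical spec strings but different producers.
c1 = Node("c1", "mm(S(0),R)->S(0)", inputs=[p1])
c2 = Node("c2", "mm(S(0),R)->S(0)", inputs=[p2])

same_under_old = old_key(c1) == old_key(c2)
same_under_new = new_key(c1) == new_key(c2)
```

Under these assumptions, `same_under_old` is `True` (the two consumers would land in the same cluster) while `same_under_new` is `False`. Two consumers fed by the same producer still share a key.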
fmassa requested a review from ezyang September 6, 2025 06:55
meta-cla bot added the CLA Signed label Sep 6, 2025

fmassa commented Sep 6, 2025

The TorchTitan failure is unrelated (dataset missing); merging this to unblock.

fmassa merged commit 006fe59 into main Sep 6, 2025
4 of 6 checks passed
fmassa deleted the fmassa/fix_clustering branch September 6, 2025 07:56