Contributor

@leafs1 leafs1 commented Jul 25, 2025

Summary

The merging algorithm in the current partitioner must traverse partitions in reverse topological order, because cycle detection and the downstream-dependency checks rely on that ordering as an invariant. The problem is that we form the group partitions first and give them incrementing ids, and the groups aren't guaranteed to be in topological order. When we then create partitions for the remaining ungrouped nodes, their ids continue from where the grouped ids left off, which breaks the invariant. This PR instead assigns ids while walking the graph in reverse topological order: when we encounter a node that belongs to a group, we create the partition for the whole group; when we encounter an ungrouped node, we create a partition for just that node.
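The id assignment described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `assign_partition_ids`, `topo_order`, and `groups` are hypothetical names, nodes are plain strings, and the supported-node check is left out.

```python
from typing import Dict, Sequence


def assign_partition_ids(
    topo_order: Sequence[str],
    groups: Sequence[Sequence[str]],
) -> Dict[str, int]:
    """Assign partition ids while walking nodes in reverse topological order.

    Hypothetical helper for illustration only; not the ExecuTorch implementation.
    """
    # Map each grouped node to the index of the group it belongs to.
    group_of: Dict[str, int] = {}
    for group_index, group in enumerate(groups):
        for node in group:
            group_of[node] = group_index

    assignment: Dict[str, int] = {}
    next_id = 0
    for node in reversed(topo_order):
        if node in assignment:
            # Already covered by a group partition created earlier in the walk.
            continue
        if node in group_of:
            # First member of a group encountered: the whole group shares one id.
            for member in groups[group_of[node]]:
                assignment[member] = next_id
        else:
            # Ungrouped node: it gets a partition of its own.
            assignment[node] = next_id
        next_id += 1
    return assignment


topo_order = ["a", "b", "c", "d", "e"]
groups = [["b", "c"]]
ids = assign_partition_ids(topo_order, groups)
no_group_ids = assign_partition_ids(["a", "b"], [])
```

In this sketch ids are handed out in the order the walk first reaches each partition, so grouped and ungrouped partitions share one counter instead of the grouped ones being numbered first.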

Test plan

All previous tests for functionality still pass

pytorch-bot bot commented Jul 25, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12871

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job, 1 Unrelated Failure

As of commit 24cfc84 with merge base 66e5591:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 25, 2025
This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

@mcr229 mcr229 left a comment

A few comments, let me know your thoughts

return True

def _process_node_groups(
def _process_remaining_nodes(

let's change this to process_all_nodes or something like that

if node in assignment or not self._is_node_supported(node):
    continue

if node in processed_nodes:

what about this check? The purpose was that when traversing nodes in the graph module, we could get two nodes from the same group, and we could potentially add the group twice?
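To make the concern concrete, here is a small sketch of how a group could be emitted twice when two of its members are visited and nothing records that the group was already handled. `collect_group_partitions` and its `dedupe` flag are hypothetical and exist only for this illustration; they are not part of the partitioner.

```python
from typing import List, Sequence, Set


def collect_group_partitions(
    nodes: Sequence[str],
    group: Sequence[str],
    dedupe: bool,
) -> List[List[str]]:
    """Emit the group as a partition whenever one of its members is visited.

    Hypothetical helper for illustration only.
    """
    processed: Set[str] = set()
    partitions: List[List[str]] = []
    for node in nodes:
        if node not in group:
            continue
        if dedupe and node in processed:
            # The group was already emitted via another of its members.
            continue
        partitions.append(list(group))
        processed.update(group)
    return partitions


with_guard = collect_group_partitions(["x", "y", "z"], ["x", "y"], dedupe=True)
without_guard = collect_group_partitions(["x", "y", "z"], ["x", "y"], dedupe=False)
```

With the guard the group appears once; without it the group appears once per visited member.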

# Update partition map
for node in partition.nodes:
    for user in node.users:
        target_id = assignment.get(user)

Suggested change
- target_id = assignment.get(user)
+ target_id = assignment.get(user, None)

Just for readability
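For reference, the two spellings behave identically on a plain Python `dict`, since `dict.get` already defaults to `None`. The snippet below is a minimal check that assumes `assignment` is an ordinary dict; the keys are made up for the example.

```python
# Hypothetical stand-in for the partitioner's assignment map.
assignment = {"user_a": 0}

# Missing key: both forms return None.
implicit_default = assignment.get("user_b")
explicit_default = assignment.get("user_b", None)

# Present key: both forms return the stored id.
implicit_hit = assignment.get("user_a")
explicit_hit = assignment.get("user_a", None)
```

So the suggestion changes only how explicit the default is at the call site, not the result.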

@leafs1 leafs1 force-pushed the partitionerOptimize branch 3 times, most recently from dd3542a to 988c4b2 Compare July 25, 2025 23:05
@leafs1 leafs1 force-pushed the partitionerOptimize branch from 988c4b2 to 24cfc84 Compare July 25, 2025 23:06
@leafs1 leafs1 merged commit b8fe100 into pytorch:main Jul 26, 2025
97 of 99 checks passed
3 participants