
Conversation

Contributor

@BoyuanFeng BoyuanFeng commented Oct 28, 2025

This PR adds custom_should_partition_ops: list[str] to specify the names of custom ops at which graph partition happens. It works with caching since it is a list[str] in the config file. The op name should be of the format "mylib::baz".

Close: #165341
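
A minimal usage sketch follows; the op name "mylib::baz" comes from this PR's description, while the custom-op registration details and the toy compiled function are assumptions about a typical setup rather than code from this PR:

```python
import torch
import torch._inductor.config as inductor_config

# Hypothetical custom op; only the name "mylib::baz" is taken from this PR.
@torch.library.custom_op("mylib::baz", mutates_args=())
def baz(x: torch.Tensor) -> torch.Tensor:
    return x.clone()

@baz.register_fake
def _(x: torch.Tensor) -> torch.Tensor:
    return torch.empty_like(x)

# Ask inductor to partition the compiled graph at every call to mylib::baz.
# Because this is a plain list[str], it is serializable and works with caching.
inductor_config.custom_should_partition_ops = ["mylib::baz"]

@torch.compile
def f(x: torch.Tensor) -> torch.Tensor:
    return torch.ops.mylib.baz.default(x + 1) * 2

f(torch.randn(8))
```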

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @zou3519 @ProExpertProg


pytorch-bot bot commented Oct 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166458

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (1 Unrelated Failure)

As of commit 0582105 with merge base 5849eea:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Contributor

@eellison eellison left a comment


Looks good to me. Any thoughts on whether just passing the op object directly would be easier and less ambiguous than the string?

)

# register ops upon which inductor should partition the graph
custom_should_partition_ops: list[str] = []
Contributor


Can you add a comment that the name should be of the form namespace::kernel_name, e.g. aten::mm?

If we want users to be able to discriminate based on overload (probably not needed), we could also detect whether a `.` is present in the name and then check the full name including the overload.

Would it be easier to make this take an OpOverload / OpOverloadPacket instead? Thoughts @zou3519?
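
For instance, the requested comment on the config field could read roughly like this (the wording is only a suggestion, not the final text in this PR):

```python
# Names of custom ops at which inductor should partition the graph.
# Entries use the "namespace::kernel_name" form, e.g. "aten::mm";
# a specific overload could be named as "namespace::kernel_name.overload_name",
# e.g. "mylib::baz.default".
custom_should_partition_ops: list[str] = []
```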

Contributor

@zou3519 zou3519 Oct 29, 2025


Let's make this handle both OpOverload and OpOverloadPacket.

@eellison we're passing in the string instead of the operator object because the operator object itself is not serializable
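
A small sketch of how the string matching could accept both spellings; the helper name and the precomputed op-name argument are illustrative assumptions, not code from this PR:

```python
def matches_custom_partition_op(op_name: str, registered: list[str]) -> bool:
    # op_name is assumed to be the fully qualified OpOverload name, e.g.
    # "mylib::baz.default" (or "mylib::baz" for the default overload).
    # Entries in `registered` may name either a specific overload
    # ("mylib::baz.default") or the whole OpOverloadPacket ("mylib::baz").
    if op_name in registered:
        return True
    packet_name = op_name.split(".", 1)[0]  # strip the ".overload" suffix if present
    return packet_name in registered

# Both the packet spelling and the overload spelling match:
assert matches_custom_partition_op("mylib::baz.default", ["mylib::baz"])
assert matches_custom_partition_op("mylib::baz.default", ["mylib::baz.default"])
```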

Comment on lines 4928 to 4931
op = ir_node.op_overload
if op is not None and op.name() in config.custom_should_partition_ops:
    assert isinstance(op, torch._ops.OpOverload)
    return True
Contributor

@zou3519 zou3519 Oct 29, 2025


Will this immediately break vLLM main x PyTorch main, or does vLLM not rely on the deleted logic?

If it breaks vLLM main x PyTorch main immediately, we're going to want to be a bit careful.

Contributor Author


I have a follow-up PR on the vLLM side: vllm-project/vllm#27702
It adopts the new change and reverts the previous workarounds. Tests passed on both PyTorch 2.9 and PyTorch main.

torch.ops.mylib.baz.default, should_partition
)
# custom_should_partition_ops takes effect, which leads to 2 partitions
torch._inductor.config.custom_should_partition_ops = ["mylib::baz"]
Contributor


If we're doing another test with an OpOverload, we can test ["mylib::baz.default"] in addition to what we have now.
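
A hedged sketch of that extra case, with the test harness omitted and the op name mirroring the snippet above:

```python
import torch._inductor.config as inductor_config

# Register the fully qualified OpOverload name instead of the packet name;
# the expectation would be the same two partitions as with "mylib::baz".
inductor_config.custom_should_partition_ops = ["mylib::baz.default"]
```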

== "1"
)

# register ops upon which inductor should partition the graph
Contributor


@oulgen @jamesjwu could you sanity check us? Is everything in the Inductor config included as part of the FX graph cache key?

Contributor


Yes, unless you manually opt out, it will be included by default.

@BoyuanFeng
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.



Successfully merging this pull request may close these issues.

fix registration design for inductor graph partition for vLLM

7 participants