
Conversation

@tugsbayasgalan (Contributor) commented on Oct 1, 2025

This needs to land after pytorch/pytorch#164401.

All the fullgraph_capture-based APIs are moving towards a common data structure called GraphCaptureOutput. The idea is for this data structure to serve as the source of truth from which all the information needed to build a runnable compiled artifact is pulled, so that downstream users can assemble their products from modular components. Today autoparallel uses export directly, which carries a lot of extra machinery that autoparallel probably doesn't care about. This PR shows that a simple, lean API suffices instead of export.
Since the exact API may still churn, I abstract it away in a util function.
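
For concreteness, here is a minimal sketch of what that util function could look like, assembled from the calls visible in this PR's diff. The wrapper name `capture_graph` is made up, the import path of `_dynamo_graph_capture_for_export` is an assumption, and the underlying API may still churn:

```python
# A minimal sketch, not the PR's actual util function. The import path below
# is an assumption; _restore_state_dict is the helper added in this PR.
import torch
from torch._dynamo.functional_export import _dynamo_graph_capture_for_export


def capture_graph(model: torch.nn.Module, inputs: tuple) -> torch.fx.GraphModule:
    # install_free_tensors lets the capture account for tensors the model
    # reaches outside its own parameters/buffers (free tensors).
    with torch._dynamo.config.patch(install_free_tensors=True):
        gm = _dynamo_graph_capture_for_export(model)(*inputs)
    # Re-key captured parameters/buffers back to the original fully
    # qualified names (see the review thread below).
    _restore_state_dict(model, gm)
    return gm
```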

Some things we need to add are:

  1. Today autoparallel uses export, which means every custom input type must be registered with pytree; that is poor UX. With a leaner API we can avoid that requirement (see the registration example after this list).
  2. Export doesn't use all of the dynamo guards, but autoparallel might need them.
  3. Today's FQN restoration is fairly basic and probably misses some edge cases (especially around handling tensor constants). Ideally we would extract that logic from export, but export's logic is a bit convoluted, so we are starting small.
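
To make item 1 concrete, this is the kind of registration export forces on users today; `Batch` is a made-up example type:

```python
# Illustrative only: a user-defined input type must be pytree-registered
# before export can flatten it across the export boundary.
from dataclasses import dataclass

import torch
import torch.utils._pytree as pytree


@dataclass
class Batch:
    x: torch.Tensor
    y: torch.Tensor


pytree.register_pytree_node(
    Batch,
    lambda b: ((b.x, b.y), None),            # flatten -> (children, context)
    lambda children, ctx: Batch(*children),  # unflatten
)
```

With a leaner capture API, plain user types could cross the boundary without this per-type ceremony.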

The meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Oct 1, 2025.
@tugsbayasgalan requested review from ezyang and xmfan on October 1, 2025 20:44
3) Be more careful about tensor constant names.
"""
with torch._dynamo.config.patch(install_free_tensors=True):
    gm = _dynamo_graph_capture_for_export(model)(*inputs)
@tugsbayasgalan (Author) commented:

This function is used for the new strict export, and it implicitly sets up the right calling convention using pytree. We might need to change this a bit to rely on bytecode instead.
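
Roughly, the calling convention in question amounts to something like the following (an illustrative sketch, not the actual wrapper code; `in_spec`/`out_spec` stand for the treespecs recorded at capture time):

```python
import torch.utils._pytree as pytree


def call_with_pytree_convention(gm, in_spec, out_spec, *args, **kwargs):
    # Flatten the user-facing call into the flat positional signature the
    # captured graph expects.
    flat_args, spec = pytree.tree_flatten((args, kwargs))
    assert spec == in_spec, "inputs changed structure since capture"
    flat_out = gm(*flat_args)
    # Rebuild the original output structure for the caller.
    return pytree.tree_unflatten(list(flat_out), out_spec)
```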


"""
with torch._dynamo.config.patch(install_free_tensors=True):
gm = _dynamo_graph_capture_for_export(model)(*inputs)
_restore_state_dict(model, gm)
@tugsbayasgalan (Author) commented:

autoparallel wants the original FQNs to be stored.
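
For readers outside the thread, a simplified sketch of what FQN restoration means here. This is not the PR's `_restore_state_dict`; matching tensors by identity and assuming flat top-level names on `gm` are both simplifications:

```python
import torch


def restore_fqns(model: torch.nn.Module, gm: torch.fx.GraphModule) -> None:
    # Map each source tensor to its original fully qualified name.
    fqn_by_id = {id(t): n for n, t in model.named_parameters()}
    fqn_by_id.update({id(t): n for n, t in model.named_buffers()})

    # Assumes capture left parameters as flat top-level attributes on gm and
    # kept the original tensor objects alive (both are simplifications).
    for flat_name, param in list(gm.named_parameters(recurse=False)):
        fqn = fqn_by_id.get(id(param))
        if fqn is None or fqn == flat_name:
            continue
        # Recreate the dotted module path, then move the parameter under its
        # original leaf name and drop the flat alias.
        *prefix, leaf = fqn.split(".")
        mod = gm
        for part in prefix:
            if not hasattr(mod, part):
                mod.add_module(part, torch.nn.Module())
            mod = getattr(mod, part)
        mod.register_parameter(leaf, param)
        delattr(gm, flat_name)
```

A real helper must also handle tensor constants, which is exactly the edge case flagged in item 3 of the list above.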

@xmfan (Member) commented on Oct 2, 2025

Rebase for the CI fixes.

@tugsbayasgalan merged commit f1887eb into main on Oct 10, 2025
6 of 7 checks passed
tugsbayasgalan added commits that referenced this pull request on Oct 10, 2025
@fmassa fmassa deleted the tugsuu_export branch October 11, 2025 06:25