Conversation

@xmfan xmfan (Member) commented Aug 12, 2025

@meta-cla meta-cla bot added the CLA Signed label Aug 12, 2025
@xmfan xmfan requested review from ezyang, fmassa and wconstab August 12, 2025 01:26
@xmfan xmfan marked this pull request as ready for review August 12, 2025 01:26
@ezyang ezyang (Contributor) left a comment

Checking with @avikchaudhuri @tugsbayasgalan to make sure this doesn't make the eventual precompile rework harder.

@ezyang ezyang (Contributor) left a comment

I'm OK with using export to Torch IR. I'll leave it up to fmassa to decide when we should merge this (e.g., does deepcopy need to be fixed first?).

@tugsbayasgalan (Contributor) commented

> Checking with @avikchaudhuri @tugsbayasgalan to make sure this doesn't make the eventual precompile rework harder.

Since we just plan to swap out export_to_torch_ir with the eventual correct API that also works with precompile, I think this change is fine.

@fmassa fmassa (Contributor) commented Aug 19, 2025

@xmfan I was looking into implementing deepcopy for our parametrization so that we can get this merged, but I realized that there might be other issues here.

The graph captured by _export_to_torch_ir for the example_autoparallel.py code, once mixed precision is added back, looks like the following:

```
graph():
    %l_args_0_ : [num_users=1] = placeholder[target=arg0]
    %to : [num_users=2] = call_method[target=to](args = (%l_args_0_, torch.bfloat16), kwargs = {})
    %l__self___wq_weight : [num_users=1] = get_attr[target=L__self___wq_weight]
    %l__self___wk_weight : [num_users=1] = get_attr[target=L__self___wk_weight]
    %l__self___wv_weight : [num_users=1] = get_attr[target=L__self___wv_weight]
    %l__self___wo_weight : [num_users=1] = get_attr[target=L__self___wo_weight]
    %wrap_body_0 : [num_users=1] = get_attr[target=wrap_body_0]
    %tag_activation_checkpoint : [num_users=1] = call_function[target=torch.ops.higher_order.tag_activation_checkpoint](args = (%wrap_body_0, %to, %l__self___wq_weight, %l__self___wk_weight, %l__self___wv_weight, %l__self___wo_weight), kwargs = {use_reentrant: False})
    %o : [num_users=1] = call_function[target=operator.getitem](args = (%tag_activation_checkpoint, 0), kwargs = {})
    %o0 : [num_users=2] = call_function[target=operator.add](args = (%o, %to), kwargs = {})
    %l__self___w1 : [num_users=1] = get_attr[target=L__self___w1]
    %o_1 : [num_users=1] = call_method[target=forward](args = (%l__self___w1, %o0), kwargs = {})
    %o_2 : [num_users=1] = call_function[target=torch.nn.functional.relu](args = (%o_1,), kwargs = {})
    %l__self___w2 : [num_users=1] = get_attr[target=L__self___w2]
    %o_3 : [num_users=1] = call_method[target=forward](args = (%l__self___w2, %o_2), kwargs = {})
    %o_4 : [num_users=1] = call_function[target=operator.add](args = (%o0, %o_3), kwargs = {})
    %output : [num_users=1] = call_method[target=to](args = (%o_4, torch.bfloat16), kwargs = {})
    return [output]
```

You can see that the input and output casts get properly called, but the weight parametrizations disappear.
My guess is that this is because the weight names have been renamed (like L__self___wq_weight), so we never call into the parametrization.

Am I reading this right?
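
For context, the weight parametrization being discussed follows PyTorch's torch.nn.utils.parametrize pattern; a minimal illustrative sketch (the CastToBF16 class and the toy linear module are assumptions, not AutoParallel's actual implementation):

```python
import torch
import torch.nn as nn
from torch.nn.utils import parametrize


class CastToBF16(nn.Module):
    """Parametrization that exposes the weight as bfloat16 at access time."""

    def forward(self, weight: torch.Tensor) -> torch.Tensor:
        return weight.to(torch.bfloat16)


linear = nn.Linear(16, 16, bias=False)
# unsafe=True because the parametrization changes the tensor's dtype.
parametrize.register_parametrization(linear, "weight", CastToBF16(), unsafe=True)

# linear.weight now routes through CastToBF16.forward; if a traced graph
# reads the raw parameter via a renamed get_attr target instead, the cast
# is silently skipped -- which matches the symptom in the graph above.
out = linear(torch.randn(2, 16, dtype=torch.bfloat16))
```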

@xmfan xmfan force-pushed the xmfan/ac_tagging branch from 53c773c to 839b69f Compare August 19, 2025 17:14
@xmfan xmfan (Member, Author) commented Aug 19, 2025

@fmassa the casts show up properly if I enable torch._dynamo.config.install_free_tensors=True: https://gist.github.com/xmfan/5176d488358e77943dbca0ecd6fe0005. The missing casts were caused by a divergence between torch.export's dynamo and torch.compile's dynamo: https://github.com/pytorch/pytorch/blob/eba20d2d748cb17dce9aa26e5513e4567bfd8282/torch/_dynamo/variables/builder.py#L1881-L1901
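
A minimal sketch of the workaround being described, assuming the internal _export_to_torch_ir entry point this PR uses (internal API; the import path and call signature here are assumptions):

```python
import torch
import torch.nn as nn

# Flag mentioned above: it aligns export-time dynamo with torch.compile's
# handling of free tensors, so the parametrization-inserted casts show up
# in the captured graph.
torch._dynamo.config.install_free_tensors = True


class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 8)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Internal frontend referenced in this thread; this import path and call
# signature are assumptions and may differ across PyTorch versions.
from torch.export._trace import _export_to_torch_ir

gm = _export_to_torch_ir(Toy(), (torch.randn(2, 8),))
print(gm.graph)
```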

@xmfan xmfan force-pushed the xmfan/ac_tagging branch from 839b69f to f799148 Compare August 19, 2025 20:56
@fmassa fmassa (Contributor) commented Aug 20, 2025

I validated that with this change we get back the dtype cast hooks.

But it looks like the test failures are related, and this change seems to be breaking init_weights for now.

Also, am I understanding correctly that this PR will also make AutoParallel support only tuple inputs?

@xmfan xmfan (Member, Author) commented Aug 21, 2025

For init_weights: dynamo encodes the FQNs a certain way (self.linear.weight -> self____modules__linear____parameters__weight). IIRC we want the FQNs to match eager exactly for model saving/loading purposes? @fmassa
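
To make the FQN concern concrete, a hypothetical sketch of undoing the mangling in this example (the demangle_fqn helper and the exact rules it assumes are illustrative; this is the kind of manual restore that is called too brittle below):

```python
import re


def demangle_fqn(dynamo_name: str) -> str:
    """Best-effort reversal of the mangling shown above, e.g.
    'self____modules__linear____parameters__weight' -> 'linear.weight'.
    Hypothetical helper: dynamo's real mangling rules may differ."""
    name = re.sub(r"____(modules|parameters|buffers)__", ".", dynamo_name)
    if name.startswith("self."):
        name = name[len("self."):]
    return name


assert demangle_fqn("self____modules__linear____parameters__weight") == "linear.weight"
```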

@xmfan xmfan changed the title Directly use _export_to_torch_ir strict_mode to support AC tags Migrate to dynamo frontend (_export_to_torch_ir) to support AC tags Aug 21, 2025
@fmassa fmassa (Contributor) commented Aug 21, 2025

> For init_weights: dynamo encodes the FQNs a certain way (self.linear.weight -> self____modules__linear____parameters__weight). IIRC we want the FQNs to match eager exactly for model saving/loading purposes? @fmassa

@xmfan Yes, we would want to keep the same FQNs for saving/loading.

@xmfan xmfan (Member, Author) commented Aug 21, 2025

Scrap that. It's too brittle to manually restore FQNs. Keep using the export stuff.

Scrap this. Apparently it's okay: people are working on AOT precompile, and we'll just move to that.

I synced with @tugsbayasgalan; there are some refactors upcoming to export, so it'd be best if we stuck to public APIs. Given that strict=False works for trunk, there's probably not much of a gap.

So I'm breaking this PR up:
1. I'll ensure that torch.export(strict=True) works for what we currently have landed in autoparallel: #104 (see the sketch below)
2. Tugsuu is looking at getting the AC HOP to proxy itself during pre-dispatch; the tests in this PR should pass after that
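
For step 1, a minimal sketch of what exercising strict-mode export against an activation-checkpointed module could look like (the toy Block module is illustrative; per step 2, the checkpointed path may not trace end-to-end until the AC HOP proxies itself through pre-dispatch):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1 = nn.Linear(16, 16)
        self.w2 = nn.Linear(16, 16)

    def _inner(self, x):
        return self.w2(torch.relu(self.w1(x)))

    def forward(self, x):
        # Non-reentrant checkpointing; under dynamo this is captured as the
        # tag_activation_checkpoint HOP seen in the graph earlier in the thread.
        return checkpoint(self._inner, x, use_reentrant=False)


ep = torch.export.export(Block(), (torch.randn(2, 16),), strict=True)
print(ep.graph_module.graph)
```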

@xmfan xmfan mentioned this pull request Aug 21, 2025
@xmfan xmfan force-pushed the xmfan/ac_tagging branch from 36edeb4 to 300608f Compare August 26, 2025 07:04
@xmfan xmfan changed the title Migrate to dynamo frontend (_export_to_torch_ir) to support AC tags Migrate to strict mode export to support AC tags Aug 26, 2025
@xmfan xmfan changed the title Migrate to strict mode export to support AC tags Migrate to strict mode export (dynamo) to support AC tags Aug 26, 2025
@xmfan xmfan changed the title Migrate to strict mode export (dynamo) to support AC tags Migrate to strict mode export (dynamo) to support AC tags and HOPs Aug 26, 2025
@fmassa fmassa (Contributor) left a comment

LGTM once tests pass, thanks for working on it!

Also, could you add some more comments on the need for the monkey_patch?

```python
# mp_policy = MixedPrecisionPolicy(param_dtype=torch.bfloat16)
# MP policy causing some deepcopy issues
# mp_policy = MixedPrecisionPolicy(param_dtype=torch.bfloat16, reduce_dtype=torch.float32)
mp_policy = MixedPrecisionPolicy(param_dtype=torch.bfloat16)
```
Contributor commented on the snippet above:

Does the code also work with reduce_dtype=torch.float32? If yes, can you set it as the default? That is the setup we mostly use, so it would be good to have it as the default.

Member Author replied:

Yes, it works; I'll update these comments.
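
A sketch of the default being requested here, assuming MixedPrecisionPolicy is FSDP2's (the import path below is an assumption; the example script may import it from elsewhere):

```python
import torch

# Assumed import path; the example script may import this from elsewhere.
from torch.distributed.fsdp import MixedPrecisionPolicy

# Default requested in the review: bf16 params with fp32 reductions,
# since that is the setup used most often.
mp_policy = MixedPrecisionPolicy(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.float32,
)
```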



```python
@contextmanager
def monkey_patch_export_verifier():
```
Contributor commented on the snippet above:

Can you explain a bit (in code as well) why we currently need this?

Is this something that you expect we will remove in the future, or is it something that is meant to stay?

Member Author replied:

Added a comment. It's something we will remove in the future, either when:

  • export offers a mode to drop the serializability constraints, or
  • we move to the precompile frontend
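
Since the diff isn't shown here, a hypothetical sketch of the monkey-patching pattern under discussion; the patch target (torch._export.verifier.Verifier.check) and its signature are assumptions, not the PR's actual implementation:

```python
from contextlib import contextmanager
from unittest import mock

import torch._export.verifier as _verifier


@contextmanager
def monkey_patch_export_verifier():
    """Temporarily relax the export verifier so graphs that are fine for
    AutoParallel but fail export's serializability checks are not rejected.
    Hypothetical sketch: the real helper may patch a different target."""
    with mock.patch.object(_verifier.Verifier, "check", lambda self, *a, **k: None):
        yield


# Usage: keep the relaxed checks scoped to the export call.
# with monkey_patch_export_verifier():
#     ep = torch.export.export(model, (example_input,), strict=True)
```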

@xmfan xmfan force-pushed the xmfan/ac_tagging branch from e2a6491 to 1a8463f Compare August 28, 2025 17:34
@xmfan xmfan force-pushed the xmfan/ac_tagging branch from 1a8463f to 997920d Compare August 28, 2025 23:19
@xmfan xmfan merged commit bf39515 into main Aug 28, 2025
6 checks passed
@fmassa fmassa deleted the xmfan/ac_tagging branch August 29, 2025 08:08