
Save/Load OptimizedModule #101651

Closed · wants to merge 14 commits

Conversation

@msaroufim (Member) commented May 17, 2023

This enables torch.save(opt_mod) to work by simply unwrapping and saving the underlying _orig_mod.

torch.load("opt_model.pt") is then semantically equivalent to loading the uncompiled model; we could make this better later by also saving a compilation cache or by using AOT Inductor.

The goal is that torch.save(opt_mod) behaves the same as torch.save(mod), and torch.load(opt_model) the same as torch.load(mod).

That way users don't need to write code like this:

if is_compiled_module(model):
    torch.save(model._orig_mod, "model.pt")
else:
    torch.save(model, "model.pt")
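
For illustration, a minimal sketch of the intended round trip under this PR (MyModule is a hypothetical stand-in for any user nn.Module, and the file name is arbitrary):

```python
import torch
import torch.nn as nn

class MyModule(nn.Module):  # hypothetical stand-in for a user model
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 10)

    def forward(self, x):
        return self.linear(x)

mod = MyModule()
opt_mod = torch.compile(mod)

# With this PR, saving the compiled wrapper saves the unwrapped _orig_mod ...
torch.save(opt_mod, "opt_model.pt")
# ... so loading behaves like loading the uncompiled model.
loaded = torch.load("opt_model.pt")
```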

Idea 0 (rejected)

Don't support saving an optimized model at all: just throw an error, but make it human readable so users know they should call torch.save(opt_model.state_dict()) instead (#101997).
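
For concreteness, a hypothetical sketch of what that rejected approach could have looked like (the class body and the error message here are illustrative, not the code from #101997):

```python
import torch

class OptimizedModule(torch.nn.Module):  # simplified stand-in for the real wrapper
    def __init__(self, orig_mod):
        super().__init__()
        self._orig_mod = orig_mod

    def __reduce__(self):
        # Refuse to pickle the wrapper and point users at the workaround.
        raise RuntimeError(
            "torch.save() on a compiled module is not supported; "
            "save opt_model.state_dict() or opt_model._orig_mod instead."
        )
```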

Idea 1 (rejected)

Instantiate a plain nn.Module instead of an OptimizedModule when unpickling. It works, but it's confusing and it breaks deepcopying:

    def __getstate__(self):
        # Pickle only the wrapped module.
        return {"_orig_mod": self._orig_mod}

    def __setstate__(self, state):
        # On unpickling, silently turn this object into the original module's
        # class -- the confusing part that also breaks deepcopy.
        orig_mod = state["_orig_mod"]
        self.__class__ = orig_mod.__class__
        self.__dict__.update(orig_mod.__dict__)

    def __new__(cls, *args, **kwargs):
        # Pre-initialize the nn.Module machinery so __setstate__ can run
        # even though pickle never calls __init__.
        instance = super().__new__(cls)
        super(OptimizedModule, instance).__init__()
        return instance

Idea 2 (this PR now)

A lot more work, but we can turn the function wrappers into wrapper classes instead so they become picklable. Need to merge this quickly though, because the merge conflicts make me sad.
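
Roughly, the point is that a closure returned by a decorator cannot be pickled, while an instance of a module-level class can, because pickle only needs the class reference plus the instance's __dict__. A hypothetical sketch of the pattern (names and signatures simplified; not the actual convert_frame code):

```python
import pickle

def make_wrapper_fn(compiler_fn):
    # Old style: a nested function capturing compiler_fn.
    # pickle.dumps(wrapped) fails because a local function is not importable.
    def wrapped(frame):
        return compiler_fn(frame)
    return wrapped

class WrapperClass:
    # New style: state lives in attributes, so instances pickle cleanly.
    def __init__(self, compiler_fn):
        self._compiler_fn = compiler_fn

    def __call__(self, frame):
        return self._compiler_fn(frame)

def my_backend(frame):  # stand-in for a real compiler function
    return frame

wrapper = WrapperClass(my_backend)
restored = pickle.loads(pickle.dumps(wrapper))   # works
assert restored("dummy frame") == "dummy frame"
# pickle.dumps(make_wrapper_fn(my_backend))      # would raise: can't pickle a local function
```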

Tests pass on test/dynamo/test_modules.py.

Failing test tracker

  dynamo/test_logging
  dynamo/test_dynamic_shapes
  dynamo/test_misc

cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @aakhundov @desertfire

@pytorch-bot commented May 17, 2023:

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/101651

❌ 21 New Failures as of commit 18cc651 (comment generated by Dr. CI).


    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        super(OptimizedModule, instance).__init__()

Collaborator: You should definitely not do that haha

torch/_dynamo/eval_frame.py (review thread outdated, resolved)
@jansel (Contributor) left a comment:

Silently converting one type to another in pickle load is confusing, let's not do that.

@pytorch pytorch deleted a comment from pytorchmergebot Jun 1, 2023
@pytorch pytorch deleted a comment from pytorchmergebot Jun 1, 2023
@@ -1435,6 +1438,19 @@ def test_recursion(self):
         opt_mod(torch.randn(10, 10))
         self.assertEqual(cnt.frame_count, 1)

     def test_save_and_load(self):
         mod = MockModule()
         cnt = torch._dynamo.testing.CompileCounter()
@msaroufim (Member, Author) commented Jun 13, 2023:

Note from @albanD: check if it works with inductor as well. It's possible that this backend has a simple state.
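
For reference, a rough sketch of how such a round-trip check might look with and without inductor (MockModule here is a simplified stand-in for the one in test/dynamo/test_modules.py, and the assertions are illustrative rather than the exact test that landed):

```python
import io
import torch
import torch.nn as nn

class MockModule(nn.Module):  # simplified stand-in for the test suite's MockModule
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 10)

    def forward(self, x):
        return torch.relu(self.linear(x))

def check_save_and_load(backend):
    mod = MockModule()
    opt_mod = torch.compile(mod, backend=backend)
    inp = torch.randn(10, 10)
    expected = opt_mod(inp)

    buffer = io.BytesIO()
    torch.save(opt_mod, buffer)      # should go through the unwrapped _orig_mod
    buffer.seek(0)
    loaded = torch.load(buffer)

    torch.testing.assert_close(loaded(inp), expected)

check_save_and_load("eager")
check_save_and_load("inductor")      # per @albanD's note: verify inductor too
```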

@github-actions bot commented:

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 12, 2023
@msaroufim (Member, Author) commented:

Closing this for now. I couldn't figure out why this change introduces extra graph breaks. Might revisit.

@msaroufim msaroufim closed this Aug 13, 2023
weiyusheng pushed a commit to weiyusheng/pytorch that referenced this pull request May 9, 2024
weiyusheng added a commit to weiyusheng/pytorch that referenced this pull request Jun 3, 2024
weiyusheng added a commit to weiyusheng/pytorch that referenced this pull request Jun 4, 2024
pytorchmergebot pushed a commit that referenced this pull request Jun 5, 2024
## save&load support for OptimizedModule

[Issue Description](#101651)

English is not my native language; please excuse typing errors.

This PR is based on commit b958810.\
I'll deal with the merge conflicts later.

### Test results for test/dynamo

Conclusion:\
It performs the same as before as far as I can see.

ENV(CPU only):\
platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0\
configfile: pytest.ini\
plugins: anyio-3.7.1, cpp-2.3.0, flakefinder-1.1.0, xdist-3.3.1, xdoctest-1.1.0, metadata-3.1.1, html-4.1.1, hypothesis-5.35.1, rerunfailures-14.0

#### before this pr:

[before](https://github.com/pytorch/pytorch/files/15329370/before.md)

#### after this pr:

[after](https://github.com/pytorch/pytorch/files/15329376/after.md)

### Some changes

1. Add test_save_and_load to test/dynamo/test_modules.py, with and without backend='inductor'.
2. Add a `__reduce__` method to OptimizedModule and to the derived classes of _TorchDynamoContext for pickling and unpickling (a sketch of the idea follows this list).
3. Change the function wrappers into wrapper classes (including convert_frame_assert, convert_frame, and catch_errors_wrapper in torch/_dynamo/convert_frame.py, and wrap_backend_debug in torch/_dynamo/repro/after_dynamo.py).
4. Change self.output.compiler_fn into innermost_fn(self.output.compiler_fn) in torch/_dynamo/symbolic_convert.py to get the original compiler_fn and avoid the "compiler_fn is not eager" condition.
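
For change 2 above, a hypothetical sketch of the shape such a `__reduce__` can take (simplified; the actual code that landed may differ):

```python
import torch

class OptimizedModule(torch.nn.Module):  # simplified stand-in
    def __init__(self, orig_mod, dynamo_ctx):
        super().__init__()
        self._orig_mod = orig_mod
        self.dynamo_ctx = dynamo_ctx

    def __reduce__(self):
        # Rebuild the wrapper at load time from the original module and the
        # (now picklable) dynamo context, instead of pickling compiled state.
        return (self.__class__, (self._orig_mod, self.dynamo_ctx))
```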

Pull Request resolved: #126374
Approved by: https://github.com/msaroufim, https://github.com/jansel