feature: adding the ability to restore shapes after loading a traced model #90744
Conversation
Left some comments.
What's the use case for this change? Shapes on traced graphs are, afaik, just an artifact of the way the graph is produced that's useful for debugging; they don't provide any guards or guarantees. Carrying around copies of the input data seems like a pretty large price to be able to have this information.
Also, I wonder if we can just serialize the shapes (or Type pointers) directly instead of serializing IValues? This would allow us to (a) avoid modifying the jit::Module interface to have to store values, and (b) store only the metadata we need, instead of all the data that is associated with the inputs
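For illustration, a minimal Python sketch of the "metadata only" idea described above: record just the shape, dtype, and device of each example tensor rather than keeping the tensors themselves. summarize_example_inputs is a hypothetical helper for this sketch, not code from this PR.

import torch

def summarize_example_inputs(example_inputs):
    # Hypothetical helper: keep only the metadata needed to re-annotate
    # graph inputs, instead of serializing the tensor data itself.
    summaries = []
    for value in example_inputs:
        if isinstance(value, torch.Tensor):
            summaries.append({
                "shape": tuple(value.shape),
                "dtype": str(value.dtype),
                "device": str(value.device),
            })
        else:
            summaries.append(None)  # non-tensor inputs carry no shape info
    return summaries

# The metadata footprint is tiny compared to the tensor data itself.
print(summarize_example_inputs((torch.randn(4, 3, 224, 224), 7)))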
rewriteQuantizedConvForBC(m);
// Checking for and loading saved traced inputs
if (reader_->hasRecord("traced_inputs.pkl")) {
nit: imo we shouldn't attempt this if restore_shapes=False; if there are errors in this step, there should be a way to skip it.
Makes sense, I'll fix that in the next commit.
@@ -908,6 +911,7 @@ def trace_module(
    _module_class=None,
    _compilation_unit=_python_cu,
    example_inputs_is_kwarg=False,
    _store_inputs=True,
I think it's better to default to False here:
- I would guess that most users don't need this behavior
- The overhead for saving these could be pretty large if input sizes are large.
I'd like to default to on if possible, so that people who want to use a tool that accesses this information wouldn't need to retrace or change their current experience.
I've made changes so that the overhead should be only ~400 bytes per input tensor after serializing the model.
sorry, but I still prefer to default to false here. is there a reason it needs to be true?
nvm see your description below. let me ask around on this
hmmmm I see the loading of it defaults to false so it's fine... the raw file size on disk is less important than the memory consumption of the loaded model.
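For the ~400-bytes-per-tensor estimate above, a rough way to check the serialized overhead is to save the same traced module with and without input storage and compare sizes. A hedged sketch, assuming _store_inputs is exposed on torch.jit.trace_module as in the diff above:

import io
import torch

class Small(torch.nn.Module):
    def forward(self, x):
        return x * 2

example = torch.randn(8, 16)

# Trace twice: once storing the example inputs, once without.
with_inputs = torch.jit.trace_module(Small(), {"forward": example}, _store_inputs=True)
without_inputs = torch.jit.trace_module(Small(), {"forward": example}, _store_inputs=False)

def saved_size(module):
    # Serialize to an in-memory buffer and report the byte count.
    buf = io.BytesIO()
    torch.jit.save(module, buf)
    return len(buf.getvalue())

# The difference approximates the per-model overhead of storing traced inputs.
print(saved_size(with_inputs) - saved_size(without_inputs))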
@@ -550,6 +550,50 @@ def forward(self, x):
    self.assertTrue(m_buffers["buffer"].is_meta)
    self.assertTrue(m_loaded_buffers["buffer"].is_meta)

def test_save_load_with_saved_traced_inputs(self):
can you also test some forward functions that take container types (tuples, lists) as inputs? I'm a bit worried about how setInputTensorTypes handles container types. Please add variable input sizes, etc.
Also, what is the expectation for how container types behave here? e.g. should a list of tensors be annotated as a List[Tensor{dtype, size info}] or as a List[generic-Tensor]? What if the contained tensor metadata don't match each other?
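For reference, a sketch of the kind of container-type check being asked for here: trace a forward that takes a tuple of differently-sized tensors and inspect how the graph annotates that input. Illustrative only; not the test that was added to the PR.

import torch

class TakesPair(torch.nn.Module):
    def forward(self, pair):
        a, b = pair
        return torch.cat([a, b], dim=0)

# Two tensors with different first dimensions inside one tuple argument.
example_pair = (torch.randn(2, 3), torch.randn(4, 3))
traced = torch.jit.trace(TakesPair(), (example_pair,))

# Print how each graph input is typed; the first graph input is `self`.
for inp in traced.graph.inputs():
    print(inp.debugName(), ":", inp.type())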
Will do
did you add these tests? I don't see them
I added them to the same test; I've gone ahead and added some comments to make that clearer.
Alternatively, you are also free to implement this on top of pytorch for your own use cases; you can create wrappers for import / export that dump input types (during export) and load & apply the shapes (during import).
Hi David, thanks for the comments. I'll start working on most of the changes you suggested. The use case here is to allow a tool I work with to read the shape information from saved traced models, to simplify the customer experience. I'm also looking to not require people to know that they are using this tool before tracing, if possible, which is why I want saving the information to default to on. I'll see what I can do about only saving the shapes to make this a less expensive operation.
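To make that workflow concrete, a hedged end-to-end sketch: trace and save a model, then have the tool load it with shape restoration turned on and read the shapes off the graph inputs. The load-side keyword spelling is an assumption here (the review refers to a restore_shapes flag that defaults to off).

import io
import torch

class Net(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

traced = torch.jit.trace(Net(), torch.randn(1, 3, 224, 224))

buf = io.BytesIO()
torch.jit.save(traced, buf)
buf.seek(0)

# Ask for the traced input shapes to be re-applied to the loaded graph.
# The keyword name "_restore_shapes" is assumed, not confirmed above.
loaded = torch.jit.load(buf, _restore_shapes=True)

# A downstream tool can now read shape information straight off the graph.
for inp in list(loaded.graph.inputs())[1:]:  # skip the `self` input
    print(inp.debugName(), inp.type())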
left comments on how to fix the bazel build
torch/csrc/jit/ir/graph_utils.h (outdated)
namespace torch {
namespace jit {

TypePtr getTensorType(const at::Tensor& t, bool complete);
I think you need to prefix these with TORCH_API (i.e. prefix all the function definitions with TORCH_API).
Updated in the latest version
build_variables.bzl (outdated)
@@ -822,6 +823,7 @@ libtorch_python_core_sources = [
    "torch/csrc/dynamo/init.cpp",
    "torch/csrc/functorch/init.cpp",
    "torch/csrc/jit/backends/backend_init.cpp",
    "torch/csrc/jit/ir/graph_utils.cpp",
I think you should remove this to fix the bazel build issue
updated in the latest version
LGTM!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
The merge job was canceled. If you believe this is a mistake, then you can re-trigger it through pytorch-bot.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
Revert "feature: adding the ability to restore shapes after loading a traced model (pytorch#90744)". This reverts commit 0d0ebcd.
@huydhn do you know why this was reverted?
oh I see, thanks :) I wasn't able to find the internal diff but now I've found it.
Adds the ability to store the inputs used when tracing a model via torch.jit.save, and to restore the input shapes via torch.jit.load when the appropriate flags are set.
Fixes #89185
cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen