[ONNX] Deprecate use_external_data_format param from torch.onnx.export() function. #62257
Merged
hwangdeyu
merged 9 commits into
pytorch:onnx_ms_1
from
hwangdeyu:deyu/remove_use_external_data_format
Sep 1, 2021
+172
−58
facebook-github-bot
added
cla signed
oncall: jit
Add this issue/PR to JIT oncall triage queue
labels
Jul 27, 2021
💊 CI failures summary and remediations
As of commit 1d12c01 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge) (1/3), Step: "Unknown"
| Job | Step |
|---|---|
| pytorch_linux_xenial_py3_6_gcc5_4_jit_legacy_test | Report results |
| pytorch_linux_xenial_py3_6_gcc5_4_test | Report results |
ci.pytorch.org: 1 failed
This comment was automatically generated by Dr. CI.
hwangdeyu
force-pushed
the
deyu/remove_use_external_data_format
branch
2 times, most recently
from
July 27, 2021 14:34
8301178
to
43bc6de
Compare
jiafatom
reviewed
Jul 27, 2021
fatcat-z
reviewed
Aug 6, 2021
fatcat-z
reviewed
Aug 6, 2021
fatcat-z
reviewed
Aug 6, 2021
hwangdeyu
force-pushed
the
deyu/remove_use_external_data_format
branch
from
August 9, 2021 09:48
4053816
to
228b093
Compare
jiafatom
reviewed
Aug 12, 2021
hwangdeyu
force-pushed
the
deyu/remove_use_external_data_format
branch
2 times, most recently
from
August 17, 2021 16:02
17df005
to
38f3756
Compare
jiafatom
reviewed
Aug 19, 2021
BowenBao
force-pushed
the
onnx_ms_1
branch
2 times, most recently
from
August 20, 2021 20:46
7c7074e
to
c500fb6
Compare
hwangdeyu
force-pushed
the
deyu/remove_use_external_data_format
branch
2 times, most recently
from
August 24, 2021 07:31
4d1593a
to
0c1aec8
Compare
jiafatom
reviewed
Aug 24, 2021
jiafatom
reviewed
Aug 26, 2021
This was referenced Sep 1, 2021
Closed
Closed
BowenBao
added a commit
that referenced
this pull request
Sep 1, 2021
….onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 7, 2021
…param from torch.onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 7, 2021
…t() function. (#62257)
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: 6bb15060d028c0b1c28ecd4aa2360e36e81db4e5
Pull Request resolved: #64382

fix use external data format pr rebase error (#64357)
* Fix use_external_data_format PR rebase error.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: 6bb15060d028c0b1c28ecd4aa2360e36e81db4e5
Pull Request resolved: #64383
BowenBao
added a commit
that referenced
this pull request
Sep 7, 2021
….onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
[ghstack-poisoned]
garymm
added a commit
that referenced
this pull request
Sep 18, 2021
…t() function. (#62257)
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 20, 2021
…t() function. (#62257)
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: e9ca7d7ce3bbfb211b1a43147e3ced3dbd690c86
Pull Request resolved: #64382

fix use external data format pr rebase error (#64357)
* Fix use_external_data_format PR rebase error.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: e9ca7d7ce3bbfb211b1a43147e3ced3dbd690c86
Pull Request resolved: #64383
BowenBao
added a commit
that referenced
this pull request
Sep 20, 2021
…param from torch.onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
Differential Revision: [D30905265](https://our.internmc.facebook.com/intern/diff/D30905265)
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 20, 2021
….onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
Differential Revision: [D30905265](https://our.internmc.facebook.com/intern/diff/D30905265)
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 22, 2021
…t() function. (#62257)
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: 1b5a6bd45e9c83580ab6b7af45b7c715dd74f4c7
Pull Request resolved: #64382

fix use external data format pr rebase error (#64357)
* Fix use_external_data_format PR rebase error.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
ghstack-source-id: 1b5a6bd45e9c83580ab6b7af45b7c715dd74f4c7
Pull Request resolved: #64383
BowenBao
added a commit
that referenced
this pull request
Sep 22, 2021
…param from torch.onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
Differential Revision: [D30905265](https://our.internmc.facebook.com/intern/diff/D30905265)
[ghstack-poisoned]
BowenBao
added a commit
that referenced
this pull request
Sep 22, 2021
….onnx.export() function. (#62257)"
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Co-authored-by: hwangdeyu <dejack953@outlook.com>
Differential Revision: [D30905265](https://our.internmc.facebook.com/intern/diff/D30905265)
[ghstack-poisoned]
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 24, 2021
…t() function. (#62257) (#64382)
Summary:
Pull Request resolved: #64382
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905265
Pulled By: malfet
fbshipit-source-id: 82b4e17bfa6a8de2bfd700a5282c12f6835603cb
Co-authored-by: hwangdeyu <dejack953@outlook.com>
garymm
pushed a commit
to garymm/pytorch
that referenced
this pull request
Oct 1, 2021
…t() function. (pytorch#62257) (pytorch#64382)
Summary:
Pull Request resolved: pytorch#64382
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905265
Pulled By: malfet
fbshipit-source-id: 82b4e17bfa6a8de2bfd700a5282c12f6835603cb
Co-authored-by: hwangdeyu <dejack953@outlook.com>
malfet
pushed a commit
that referenced
this pull request
Oct 8, 2021
* [ONNX] Remove argument _retain_param_name from torch.onnx.export() function. (#61702) (#64370)
Summary:
Pull Request resolved: #64370
As of now, the "_retain_param_name" parameter has no description on the PyTorch docs website. According to the code, this argument determines whether we keep the original parameter names of the PyTorch model in the final ONNX graph. If it is False, those original parameter names are replaced with a series of integers starting from 1.
Since numbers as parameter names make no sense to users, we remove this argument from the torch.onnx.export() function to improve the user experience of calling this function.
This PR still keeps it in the torch.onnx.export() function for backward compatibility, while all backend logic has been changed to work as if _retain_param_name is set to True.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905270
Pulled By: malfet
fbshipit-source-id: ca60757ca17daaff937e9f08da42596086795f4a
Co-authored-by: fatcat-z <zhang-ji@outlook.com>

* [ONNX] Remove strip_doc_string param from torch.onnx.export() function. (#61712) (#64371)
Summary:
Pull Request resolved: #64371
As of now, the "strip_doc_string" parameter is described as below:
strip_doc_string (bool, default True): do not include the field `doc_string` from the exported model. Otherwise the field will mention the source code locations for `model`.
This is usually useless to users who want to convert a PyTorch model to an ONNX one. Only when the user wants to debug the export process do these source code locations provide a benefit.
To make the export() function friendlier by providing fewer parameters, we combined "strip_doc_string" into the "verbose" parameter. If a user sets verbose to True, the user needs some log information for debugging the export process, which is similar to the purpose of the strip_doc_string parameter.
But the two arguments work in opposite directions: setting verbose to True means we want to print log information to help debug, which means strip_doc_string should be False. This is how we replace strip_doc_string with the verbose argument in this PR.
This PR still keeps it in the torch.onnx.export() function for backward compatibility, while its usage has been combined with the verbose argument.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905268
Pulled By: malfet
fbshipit-source-id: 2f06eb805c01fe15ff7a1b4f6595c937ba716d60
Co-authored-by: fatcat-z <zhang-ji@outlook.com>

* [ONNX] minor doc improvements and cleanup (#62514) (#64373)
Summary:
Pull Request resolved: #64373
* Fix some bad formatting and clarify things in onnx.rst.
* In `export_to_pretty_string`:
  * Add documentation for previously undocumented args.
  * Document that the `f` arg is ignored and mark it deprecated.
  * Update tests to stop setting `f`.
  * Warn if `_retain_param_name` is set.
* Use double quotes for string literals in test_operators.py.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905271
Pulled By: malfet
fbshipit-source-id: 3627eeabf40b9516c4a83cfab424ce537b36e4b3

* [ONNX] Deprecated the example_outputs param from torch.onnx.export() function. (#62815) (#64380)
Summary:
Pull Request resolved: #64380
* `example_outputs` was used to determine the type and shape of the outputs without tracing the execution of the model. It had to be provided when exporting a ScriptModule or ScriptFunction with the export() function.
* Since `example_outputs` can be worked out in an internal function instead of being provided by the user, we deprecated this argument in the export() function to improve the user experience of calling this function.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905266
Pulled By: malfet
fbshipit-source-id: d00b00d7d02b365d165028288ad915678caa51f2
Co-authored-by: hwangdeyu <dejack953@outlook.com>

* [ONNX] Deprecate use_external_data_format param from torch.onnx.export() function. (#62257) (#64382)
Summary:
Pull Request resolved: #64382
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D30905265
Pulled By: malfet
fbshipit-source-id: 82b4e17bfa6a8de2bfd700a5282c12f6835603cb
Co-authored-by: hwangdeyu <dejack953@outlook.com>

* fix clang-tidy error introduced by #64382 (#65977)
Summary:
Pull Request resolved: #65977
Reviewed By: ngimel
Differential Revision: D31423174
Pulled By: malfet
fbshipit-source-id: 0ea560b9a6ddd6431f70bd3ac10ace68e26ab352
Co-authored-by: BowenBao <bowbao@microsoft.com>
Co-authored-by: fatcat-z <zhang-ji@outlook.com>
Co-authored-by: hwangdeyu <dejack953@outlook.com>
pytorchmergebot
pushed a commit
that referenced
this pull request
Oct 14, 2023
Fixes #110982

#62257 deprecated the `torch.onnx.export(use_external_data_format: bool=...)` argument, but it seems the introduced `EncoderBase::GetGraphProtoSize` has a bug and doesn't detect models > 2GB when ONNX Constant nodes are large (and responsible for the size overflow).

This PR adds the constant nodes to the total size of the model, along with initializers.

In Python, what we need to do is:

```python
import onnx
from onnx import numpy_helper


def compute_tensor_size(tensor):
    # Size in bytes of a numpy array: element count times bytes per element
    return tensor.size * tensor.itemsize


def sum_constant_and_initializer_sizes(model_path):
    # Load the ONNX model
    model = onnx.load(model_path)
    total_size = 0
    initializer_size = 0
    constant_size = 0

    # Compute the size of constant nodes
    for node in model.graph.node:
        if node.op_type == 'Constant':
            # Assumes the first attribute holds the tensor value
            constant_value = node.attribute[0].t
            # Convert constant value to numpy array
            constant_array = numpy_helper.to_array(constant_value)
            # Compute the size of the constant tensor
            tensor_size = compute_tensor_size(constant_array)
            total_size += tensor_size
            constant_size += tensor_size

    # Compute the size of initializers that are not graph inputs
    graph_input_names = {graph_input.name for graph_input in model.graph.input}
    for initializer in model.graph.initializer:
        if initializer.name not in graph_input_names:
            # Convert the initializer to a numpy array to calculate its size
            tensor = numpy_helper.to_array(initializer)
            tensor_size = compute_tensor_size(tensor)
            total_size += tensor_size
            initializer_size += tensor_size

    return total_size, constant_size, initializer_size


model_path = '/path/to/model.onnx'
total_size, constant_size, initializer_size = sum_constant_and_initializer_sizes(model_path)

print("Total size of constant nodes in bytes:", constant_size)
print("Total size of initializer nodes (excluding graph inputs) in bytes:", initializer_size)
print("Total size of constant and initializer nodes (excluding graph inputs) in bytes:", total_size)
```

Pull Request resolved: #111097
Approved by: https://github.com/justinchuby, https://github.com/zhipenghan
yeounoh
pushed a commit
to yeounoh/pytorch
that referenced
this pull request
Oct 16, 2023
Fixes pytorch#110982

pytorch#62257 deprecated the `torch.onnx.export(use_external_data_format: bool=...)` argument, but it seems the introduced `EncoderBase::GetGraphProtoSize` has a bug and doesn't detect models > 2GB when ONNX Constant nodes are large (and responsible for the size overflow).

This PR adds the constant nodes to the total size of the model, along with initializers.

In Python, what we need to do is:

```python
import onnx
from onnx import numpy_helper


def compute_tensor_size(tensor):
    # Size in bytes of a numpy array: element count times bytes per element
    return tensor.size * tensor.itemsize


def sum_constant_and_initializer_sizes(model_path):
    # Load the ONNX model
    model = onnx.load(model_path)
    total_size = 0
    initializer_size = 0
    constant_size = 0

    # Compute the size of constant nodes
    for node in model.graph.node:
        if node.op_type == 'Constant':
            # Assumes the first attribute holds the tensor value
            constant_value = node.attribute[0].t
            # Convert constant value to numpy array
            constant_array = numpy_helper.to_array(constant_value)
            # Compute the size of the constant tensor
            tensor_size = compute_tensor_size(constant_array)
            total_size += tensor_size
            constant_size += tensor_size

    # Compute the size of initializers that are not graph inputs
    graph_input_names = {graph_input.name for graph_input in model.graph.input}
    for initializer in model.graph.initializer:
        if initializer.name not in graph_input_names:
            # Convert the initializer to a numpy array to calculate its size
            tensor = numpy_helper.to_array(initializer)
            tensor_size = compute_tensor_size(tensor)
            total_size += tensor_size
            initializer_size += tensor_size

    return total_size, constant_size, initializer_size


model_path = '/path/to/model.onnx'
total_size, constant_size, initializer_size = sum_constant_and_initializer_sizes(model_path)

print("Total size of constant nodes in bytes:", constant_size)
print("Total size of initializer nodes (excluding graph inputs) in bytes:", initializer_size)
print("Total size of constant and initializer nodes (excluding graph inputs) in bytes:", total_size)
```

Pull Request resolved: pytorch#111097
Approved by: https://github.com/justinchuby, https://github.com/zhipenghan
* The `use_external_data_format` parameter is for large models that cannot be exported because of the 2GB protobuf limit.
* When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself.
* This PR marks the parameter as DEPRECATED and checks the model proto size in code instead of relying on the user: if the size is larger than 2GB, `use_external_data_format = True` is applied automatically.
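
The decision described above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `should_use_external_data` and `PROTOBUF_LIMIT_BYTES` are hypothetical names, the limit is assumed to be 2**31 - 1 bytes, and the sketch assumes the deprecated flag is ignored after a warning, which may differ from how the exporter actually treats it.

```python
import warnings
from typing import Iterable, Optional

# Assumption: the "2GB protobuf limit" is modeled as 2**31 - 1 bytes.
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def should_use_external_data(
    tensor_nbytes: Iterable[int],
    use_external_data_format: Optional[bool] = None,
) -> bool:
    """Hypothetical helper: choose external data storage from tensor sizes."""
    if use_external_data_format is not None:
        # Assumption: the deprecated flag is ignored and only triggers a warning.
        warnings.warn(
            "use_external_data_format is deprecated; "
            "the size check decides automatically.",
            DeprecationWarning,
            stacklevel=2,
        )
    # Sum the byte sizes of all tensors that would be serialized into the proto.
    total_bytes = sum(tensor_nbytes)
    return total_bytes > PROTOBUF_LIMIT_BYTES


# Two small tensors stay well under the limit.
small_model = [4 * 1024, 16 * 1024]
# Three 1 GiB tensors together exceed the limit.
large_model = [1024**3, 1024**3, 1024**3]

print(should_use_external_data(small_model))  # False
print(should_use_external_data(large_model))  # True
```

With a check of this kind the caller no longer has to know the model size in advance, which is the motivation given for deprecating the explicit flag.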