Share more constant initializers #15461
Merged
(cherry picked from commit a4b0ea8)
…pengwa/const_share
askhade
reviewed
Apr 11, 2023
    using SupportedTypeList = boost::mp11::mp_list<MLFloat16, float, double, int32_t, int64_t>;

    bool IsValidSingleValueShape(const ONNX_NAMESPACE::TensorShapeProto* input_shape) {
      static constexpr int64_t MAX_SIZE_PER_VALUE = 8;
Contributor
Why do we have this restriction on the number of elements?
Contributor
Author
A larger element-count threshold here means more overhead when running the ConstantSharing graph transformation. Originally the limit was 1; I have raised it to 8 as a gradual step. We can make it bigger once we find that it helps in specific scenarios.
Contributor
Can you add some comments explaining the reasoning behind choosing 8, and what one should consider if they want to change this or remove the limitation altogether in the future? Thanks!
Contributor
Author
Sure, I added a comment for it. :)
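To make the trade-off in this thread concrete, here is a minimal sketch of an element-count gate. It is not the code from the PR: `IsSmallEnoughToShare`, `kMaxElementsPerValue`, and the use of a plain `std::vector<int64_t>` in place of `ONNX_NAMESPACE::TensorShapeProto` are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the MAX_SIZE_PER_VALUE threshold discussed above.
constexpr int64_t kMaxElementsPerValue = 8;

// Returns true when all dimensions are known and the total element count
// stays within the threshold. `dims` is a simplified stand-in for a tensor
// shape; a negative entry models an unknown (symbolic) dimension.
bool IsSmallEnoughToShare(const std::vector<int64_t>& dims) {
  int64_t count = 1;  // A rank-0 (scalar) tensor holds exactly one element.
  for (int64_t d : dims) {
    if (d < 0) return false;                     // Unknown dimension.
    if (d > kMaxElementsPerValue) return false;  // Conservative early exit; keeps the product small.
    count *= d;
    if (count > kMaxElementsPerValue) return false;
  }
  return true;
}
```

With the threshold at 8, shapes such as `{}`, `{8}`, and `{2, 4}` pass, while `{9}` and `{3, 3}` are rejected. Raising the constant admits more tensors at the cost of more comparison work per initializer.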
…pengwa/const_share
…pengwa/const_share
askhade
approved these changes
Apr 14, 2023
wejoncy
pushed a commit
that referenced
this pull request
Apr 18, 2023
### Minor fix for differently scoped cpu_ep
cpu_ep is declared under `#ifndef DISABLE_CONTRIB_OPS`, but one of its usages is
not under the same condition.
```
#ifndef DISABLE_CONTRIB_OPS
const InlinedHashSet<std::string_view> cpu_ep = {onnxruntime::kCpuExecutionProvider};
#endif
```
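The scoping problem can be illustrated with a small, self-contained sketch. The function `RegisteredTransformers`, the transformer names, and the `kCpuEp` constant are hypothetical and are not taken from `graph_transformer_utils.cc`; the only point is that a variable declared under a preprocessor guard must have every use under the same guard.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for onnxruntime::kCpuExecutionProvider.
constexpr std::string_view kCpuEp = "CPUExecutionProvider";

std::vector<std::string> RegisteredTransformers() {
  std::vector<std::string> names;

#ifndef DISABLE_CONTRIB_OPS
  // Declared only when contrib ops are enabled.
  const std::vector<std::string_view> cpu_ep = {kCpuEp};
#endif

  // This registration never touches cpu_ep, so it needs no guard.
  names.emplace_back("AlwaysOnTransformer");

#ifndef DISABLE_CONTRIB_OPS
  // Every use of cpu_ep sits under the same guard as its declaration.
  // Without this guard, a build that defines DISABLE_CONTRIB_OPS would
  // fail with an "undeclared identifier" error.
  names.emplace_back("ContribOnlyTransformer@" + std::string(cpu_ep.front()));
#endif

  return names;
}
```

Compiling this sketch with and without `-DDISABLE_CONTRIB_OPS` succeeds in both configurations, which is the property the fix restores.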
### Motivation and Context
Postmortem: #15461 passed
all CIs except the Linux/Windows TVM CIs. I did not check the detailed error
messages at the time because those CIs had already been failing for some reason
for at least a few days. On closer inspection, after PR 15461 the error message
changed from
the TVM CI error message seen before the constant sharing change:
```
https://github.com/microsoft/onnxruntime/actions/runs/4700368634/jobs/8334955814
ERROR: testBooleanInputs (__main__.TestInferenceSession)
----------------------------------------------------------------------
Traceback (most recent call last):
File "onnxruntime_test_python.py", line 617, in testBooleanInputs
sess = onnxrt.InferenceSession(get_name("logicaland.onnx"), providers=available_providers)
File "D:\a\onnxruntime\onnxruntime\build\Release\Release\onnxruntime\capi\onnxruntime_inference_collection.py", line 383, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "D:\a\onnxruntime\onnxruntime\build\Release\Release\onnxruntime\capi\onnxruntime_inference_collection.py", line 435, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\a\onnxruntime\onnxruntime\onnxruntime\core\providers\tvm\tvm_api.cc:49 onnxruntime::tvm::TVMCompile compile != nullptr was false. Unable to retrieve 'tvm_onnx_import_and_compile'.
```
to
```
D:\a\onnxruntime\onnxruntime\onnxruntime\core\optimizer\graph_transformer_utils.cc(213,67): error C2065: 'cpu_ep': undeclared identifier [D:\a\onnxruntime\onnxruntime\build\Release\onnxruntime_optimizer.vcxproj]
D:\a\onnxruntime\onnxruntime\onnxruntime\core\optimizer\graph_transformer_utils.cc(213,19): error C2672:
```
This PR fixes the build issue. The error messages of the Windows/Linux
TVM CIs are back to the original ones.
Share more constant initializers.
The `ConstantSharing` transformer originally only handled single-value initializers (scalar or 1D). This PR shares more cases so that the common subexpression elimination transformer can remove more duplicated nodes.

Originally, we used a single `vector<std::variant<float,half,int32,int64>>` to store different scalar values. In this PR, we create an unordered map whose key is data_type + rank + element count and whose value is a vector of `InitializerValue`. For a specific initializer that fulfils the condition, we find the corresponding vector of `InitializerValue` by its <data_type + rank + element count>, then search that vector to see whether the constant tensor already exists. After that, a value id is returned, which is combined with <data_type + rank + element count> to form the pattern key that decides which tensor to reuse (legacy code).

### Motivation and Context
One example we see here is:

    stateDiagram
        [*] --> LayerNorm(b,s,64)
        LayerNorm(b,s,64) --> Reshape1
        Shape1_Const[b*s,64] --> Reshape1
        LayerNorm(b,s,64) --> Reshape2
        Shape2_Const[b*s,64] --> Reshape2
        Reshape1 --> AttentionSubGraph
        Reshape2 --> Add
        AttentionSubGraph --> Add
        Add --> [*]

Ideally, CommonSubexpressionElimination would remove one of `Reshape1` and `Reshape2`, but because `Shape1_Const` and `Shape2_Const` are different `NodeArg*`s, it did not remove the duplication. This is just one example: removing the duplication opens up more opportunities to apply graph transformations.
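The bucketed lookup outlined in the description can be sketched as follows. This is a simplified illustration rather than the PR's implementation: the `InitializerValue` struct here holds only an int64 payload, and `ConstantRegistry`, `PatternKeyFor`, and the string key format are assumptions introduced for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified, hypothetical stand-in for a constant initializer.
struct InitializerValue {
  int32_t data_type;          // Type tag, e.g. a tensor element type enum value.
  std::vector<int64_t> dims;  // Tensor shape.
  std::vector<int64_t> data;  // Flattened contents (int64 only in this sketch).
};

bool operator==(const InitializerValue& a, const InitializerValue& b) {
  return a.data_type == b.data_type && a.dims == b.dims && a.data == b.data;
}

class ConstantRegistry {
 public:
  // Returns a pattern key; initializers with equal type, shape, and contents
  // always receive the same key, so they can be replaced by one shared tensor.
  std::string PatternKeyFor(const InitializerValue& v) {
    // Bucket key: data_type + rank + element count.
    const std::string bucket_key = std::to_string(v.data_type) + "_" +
                                   std::to_string(v.dims.size()) + "_" +
                                   std::to_string(v.data.size());
    std::vector<InitializerValue>& bucket = buckets_[bucket_key];

    // Linear search inside the bucket; the position found is the value id.
    std::size_t value_id = 0;
    while (value_id < bucket.size() && !(bucket[value_id] == v)) {
      ++value_id;
    }
    if (value_id == bucket.size()) {
      bucket.push_back(v);  // First time this constant is seen.
    }
    return bucket_key + "_" + std::to_string(value_id);
  }

 private:
  std::unordered_map<std::string, std::vector<InitializerValue>> buckets_;
};
```

In terms of the Reshape example, two shape constants with identical contents (say `{128, 64}` for both `Shape1_Const` and `Shape2_Const`) map to the same pattern key, so both Reshape nodes can be pointed at one shared initializer. A constant with different contents in the same bucket gets a different value id and is kept separate.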