Generalize reshape fusion #3554
* Allow arbitrary number of Concat arguments
* Apply fusion even when an output of an internal node is used elsewhere
* Fix a bug when an internal node's output is the subgraph output
* Simplify code
shape_value.reserve(concat_input_count);
// Used to keep the following nodes in the order of their potential removal.
enum class NodeType { Unsqueeze, Gather, Shape };
std::set<std::pair<NodeType, NodeIndex>> candidates_for_removal;
candidates_for_removal can be a std::vector<NodeIndex>: append the node indices of the Unsqueeze, Gather, and Shape nodes in that order.
This will not work correctly in case of sharing, e.g. when different Gather nodes refer to the same Shape node.
That's right. We would need to check that the node index is not already in the vector before appending, to avoid duplicate node indices.
Then the vector will be in the order [unsqueeze1, gather1, shared_shape, unsqueeze2, gather2], which is incorrect because we want to remove nodes in reverse topological order (unsqueeze2 and gather2 must come before shared_shape).
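The ordering argument in this thread can be checked with a small standalone sketch. It is not the code from the PR: `Chain`, `OrderWithSet`, and `OrderWithDedupVector` are hypothetical names, and `NodeIndex` is assumed to be an alias for `std::size_t`. Only the `NodeType` enum and the `std::set<std::pair<NodeType, NodeIndex>>` type come from the diff above.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using NodeIndex = std::size_t;  // assumption: stands in for the real NodeIndex type

// Enumerators are declared in removal order: consumers before producers.
enum class NodeType { Unsqueeze, Gather, Shape };

// Hypothetical description of one Shape->Gather->Unsqueeze chain.
struct Chain {
  NodeIndex unsqueeze;
  NodeIndex gather;
  NodeIndex shape;
};

// The set sorts pairs by NodeType first, so every Unsqueeze precedes every
// Gather, and every Gather precedes every Shape. Shared nodes are deduplicated.
std::vector<NodeIndex> OrderWithSet(const std::vector<Chain>& chains) {
  std::set<std::pair<NodeType, NodeIndex>> candidates;
  for (const Chain& c : chains) {
    candidates.insert({NodeType::Unsqueeze, c.unsqueeze});
    candidates.insert({NodeType::Gather, c.gather});
    candidates.insert({NodeType::Shape, c.shape});
  }
  std::vector<NodeIndex> order;
  for (const auto& entry : candidates) order.push_back(entry.second);
  return order;
}

// The alternative discussed in the thread: a vector with a duplicate check.
// A shared Shape node keeps the position of its first insertion.
std::vector<NodeIndex> OrderWithDedupVector(const std::vector<Chain>& chains) {
  std::vector<NodeIndex> order;
  auto append_unique = [&order](NodeIndex n) {
    if (std::find(order.begin(), order.end(), n) == order.end()) order.push_back(n);
  };
  for (const Chain& c : chains) {
    append_unique(c.unsqueeze);
    append_unique(c.gather);
    append_unique(c.shape);
  }
  return order;
}
```

For two chains `{10, 11, 5}` and `{20, 21, 5}` that share Shape node 5, `OrderWithSet` yields `10, 20, 11, 21, 5`, while `OrderWithDedupVector` yields `10, 11, 5, 20, 21`, placing the shared Shape node ahead of the second chain's Unsqueeze and Gather.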
/azp run Linux CPU CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,Win CPU x86 CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,Win CPU x64 NoContribops CI Pipeline,MacOS NoContribops CI Pipeline,Linux CPU x64 NoContribops CI Pipeline
Azure Pipelines successfully started running 8 pipeline(s).
f5e7e60
/azp run Linux CPU CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,MacOS NoContribops CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Windows CPU CI Pipeline
Azure Pipelines successfully started running 9 pipeline(s).
if (concat_input_count > 3) {
if (!optimizer_utils::AppendTensorFromInitializer(graph, *(concat.InputDefs()[3]), shape_value)) {
if (!optimizer_utils::IsInitializerWithExpectedValue(graph, *(gather.InputDefs()[1]), int64_t(i), false)) {
The expected value should be shape_value.size() instead of i because the tensor appended in line 97 could have multiple elements. @MaximKalininMS, could you provide a fix?
microsoft#3554 introduced a bug: initializers can now come before Shape->Gather->Unsqueeze chains; if those initializers have more than 1 element, expected dimensions in the chains are now incorrect.
* Fix a crash in Reshape
  Reshape doesn't handle 0 input dimension properly, which leads to a division by zero
* Fix reshape fusion
  #3554 introduced a bug: initializers can now come before Shape->Gather->Unsqueeze chains; if those initializers have more than 1 element, expected dimensions in the chains are now incorrect.
Authored-by: Max Kalinin <makalini@microsoft.com>
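The reshape fusion bug described above can be illustrated with a simplified model. This is a sketch under stated assumptions, not the fix that was merged: `ConcatInput` and `BuildShapeValue` are hypothetical, graph traversal is replaced by plain structs, and writing `0` for a dimension copied from the input is assumed here as the Reshape convention. The point it shows is that the Gather index expected for a chain is the number of shape elements accumulated so far (`shape_value.size()`), not the position `i` of the Concat input.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one Concat input: either a constant initializer
// or a Shape->Gather->Unsqueeze chain identified by its Gather index.
struct ConcatInput {
  bool is_initializer;
  std::vector<int64_t> initializer_values;  // used when is_initializer is true
  int64_t gather_index;                     // used when is_initializer is false
};

// Returns the fused shape, or std::nullopt when the pattern does not match.
std::optional<std::vector<int64_t>> BuildShapeValue(const std::vector<ConcatInput>& inputs) {
  std::vector<int64_t> shape_value;
  for (const ConcatInput& input : inputs) {
    if (input.is_initializer) {
      // An initializer may contribute several output dimensions at once.
      shape_value.insert(shape_value.end(), input.initializer_values.begin(),
                         input.initializer_values.end());
      continue;
    }
    // The chain must pick the input dimension at the current output position.
    const auto expected = static_cast<int64_t>(shape_value.size());
    if (input.gather_index != expected) return std::nullopt;
    shape_value.push_back(0);  // assumption: 0 means "copy this dimension from the input"
  }
  return shape_value;
}
```

With a two-element initializer `{2, 3}` followed by a chain, the chain is accepted only when its Gather index is 2, giving the shape `{2, 3, 0}`. Comparing against the Concat input position (1) would reject this valid pattern, or accept a chain that reads the wrong dimension.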
Description: Generalize reshape fusion.
Motivation and Context
Currently, reshape fusion works with a very specific graph pattern. This change will:
See source code comments for details.