onnxruntime_tools optimize a transformer model #5113
Not a bug.

I have a model similar to BERT, built from the same modules (embedding / MultiHeadedAttention), but the forward process and the inputs/outputs differ slightly. How can I optimize my model with onnxruntime_tools?
Here is my attempt:

```
~/onnxruntime-master/onnxruntime/python/tools/transformers$ python3 optimizer.py --input model/encoder.onnx --output encoder_optim.onnx --model_type bert --num_heads 4 --hidden_size 320
apply: Fused LayerNormalization count: 12
apply: Fused SkipLayerNormalization count: 12
prune_graph: Graph pruned: 0 inputs, 0 outputs and 0 nodes are removed
apply: Fused SkipLayerNormalization(add bias) count: 12
optimize: opset verion: 11
save_model_to_file: Output model to encoder_optim.onnx
get_fused_operator_statistics: Optimized operators:{'EmbedLayerNormalization': 0, 'Attention': 0, 'Gelu': 0, 'FastGelu': 0, 'BiasGelu': 0, 'LayerNormalization': 0, 'SkipLayerNormalization': 12}
main: The model has been optimized.
```

As the statistics show, only SkipLayerNormalization was fused; the Attention and EmbedLayerNormalization counts are 0, so the attention subgraph of my model was not matched by the fusion patterns.
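The same optimization can also be driven from Python instead of the CLI, which makes it easier to inspect the fusion statistics programmatically. A minimal sketch, assuming the `onnxruntime_tools` package is installed and that `model/encoder.onnx` is the exported model (paths and parameters mirror the command above and are illustrative):

```python
# Sketch: Python API equivalent of the optimizer.py CLI invocation above.
# Assumes onnxruntime_tools is installed and model/encoder.onnx exists;
# both the path and the head/hidden values are taken from the command above.
from onnxruntime_tools import optimizer

opt_model = optimizer.optimize_model(
    "model/encoder.onnx",  # exported ONNX model to optimize
    model_type="bert",     # apply BERT-style fusion patterns
    num_heads=4,           # number of attention heads
    hidden_size=320,       # hidden dimension
)

# If 'Attention' is 0 here, the fusion patterns did not match the
# model's attention subgraph (e.g. because its structure differs
# from the standard BERT export).
print(opt_model.get_fused_operator_statistics())

opt_model.save_model_to_file("encoder_optim.onnx")
```

This only reproduces the CLI behavior; a model whose attention layout differs from the standard BERT export will still show `'Attention': 0` until the graph matches one of the supported patterns.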