
onnxruntime_tools optimize a transformer model #5113

@Chen1399

Description


Not a bug.
I have a BERT-like model built from the same modules (embedding / MultiHeadedAttention), but its forward pass and inputs/outputs differ slightly from standard BERT. How can I optimize my model with onnxruntime_tools?
This is what I tried:

~/onnxruntime-master/onnxruntime/python/tools/transformers$ python3 optimizer.py --input model/encoder.onnx --output encoder_optim.onnx --model_type bert --num_heads 4 --hidden_size 320
               apply: Fused LayerNormalization count: 12
               apply: Fused SkipLayerNormalization count: 12
         prune_graph: Graph pruned: 0 inputs, 0 outputs and 0 nodes are removed
               apply: Fused SkipLayerNormalization(add bias) count: 12
            optimize: opset verion: 11
  save_model_to_file: Output model to encoder_optim.onnx
get_fused_operator_statistics: Optimized operators:{'EmbedLayerNormalization': 0, 'Attention': 0, 'Gelu': 0, 'FastGelu': 0, 'BiasGelu': 0, 'LayerNormalization': 0, 'SkipLayerNormalization': 12}
                main: The model has been optimized.
