Support for converting LayoutLM to ONNX #13300

nishprabhu · 2021-08-27T12:33:50Z

🚀 Feature request

Transformers currently provides ready-made configurations for converting BERT, BART, RoBERTa and several other models to ONNX. Can we extend this to also support LayoutLM?

Motivation

ONNX is quickly becoming the default runtime environment in many production settings. Ideally, all models supported by the library should have an easy path to conversion.

Your contribution

I am willing to submit a PR that implements this.

NielsRogge · 2021-08-27T13:01:19Z

Sure, that would be great. LayoutLM only adds 4 embedding layers on top of BERT:

transformers/src/transformers/models/layoutlm/modeling_layoutlm.py

Lines 66 to 69 in a3f96f3

    self.x_position_embeddings = nn.Embedding(config.max_2d_position_embeddings, config.hidden_size)
    self.y_position_embeddings = nn.Embedding(config.max_2d_position_embeddings, config.hidden_size)
    self.h_position_embeddings = nn.Embedding(config.max_2d_position_embeddings, config.hidden_size)
    self.w_position_embeddings = nn.Embedding(config.max_2d_position_embeddings, config.hidden_size)

So I guess it won't be that difficult to support?

cc @mfuntowicz
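
For readers unfamiliar with how those four tables are used, here is a rough NumPy sketch of the lookup, not the library code itself. It assumes each box is given as (x0, y0, x1, y1), that the x and y tables are each used twice (once per corner), and that height and width are looked up from the box size. The table values, `hidden_size = 8` and the function name `spatial_embeddings` are made up for illustration.

```python
import numpy as np

# Illustrative sizes only; a real checkpoint's config defines these.
hidden_size = 8
max_2d_position_embeddings = 1024

rng = np.random.default_rng(0)
# Stand-ins for the four nn.Embedding weight matrices quoted above.
x_emb = rng.normal(size=(max_2d_position_embeddings, hidden_size))
y_emb = rng.normal(size=(max_2d_position_embeddings, hidden_size))
h_emb = rng.normal(size=(max_2d_position_embeddings, hidden_size))
w_emb = rng.normal(size=(max_2d_position_embeddings, hidden_size))


def spatial_embeddings(bbox):
    """Sum the 2D position embeddings for boxes of shape (batch, seq, 4)."""
    bbox = np.asarray(bbox, dtype=np.int64)
    x0, y0, x1, y1 = bbox[..., 0], bbox[..., 1], bbox[..., 2], bbox[..., 3]
    return (
        x_emb[x0] + y_emb[y0]      # upper-left corner
        + x_emb[x1] + y_emb[y1]    # lower-right corner
        + h_emb[y1 - y0]           # box height
        + w_emb[x1 - x0]           # box width
    )


boxes = [[[10, 20, 110, 40], [120, 20, 220, 40]]]  # (batch=1, seq=2, 4)
out = spatial_embeddings(boxes)
print(out.shape)
```

In the model this sum is added to the usual BERT word, position and token type embeddings, which is why the exported graph needs one extra input, `bbox`, compared with BERT.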

NielsRogge · 2021-08-27T13:30:35Z

The guide written here is very helpful: https://huggingface.co/transformers/serialization.html?highlight=onnx#converting-an-onnx-model-using-the-transformers-onnx-package

nishprabhu · 2021-08-27T14:33:40Z

Thanks! It was very useful!

* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Removed regression/ folder
* Fixed import error
* Remove unnecessary import statements
* Changed max_2d_positions from class variable to instance variable of the config class
* Use super class generate_dummy_inputs method
* Add support for Masked LM, sequence classification and token classification
* Removed unnecessary import and method
* Fixed code styling
* Raise error if PyTorch is not installed
* Remove unnecessary import statement

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

github-actions · 2021-09-26T15:02:04Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.


d-v-dlee · 2022-10-07T23:38:55Z

Dumb question: how do I format the ORT inputs for LayoutLM ONNX? Does anyone have an example of LayoutLM ONNX inference?

I'm trying to pass the output of a collator into the ONNX session. It doesn't like the bounding box tensor, since its dimensions are different from those of input_ids, token_type_ids and attention_mask.
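
A minimal sketch of one way to shape the inputs, offered as a starting point rather than a confirmed answer. It assumes the exported graph takes four int64 inputs named `input_ids`, `bbox`, `attention_mask` and `token_type_ids`, and that `bbox` is expected to have one more dimension than the others: (batch, sequence, 4). Both assumptions are worth checking against `session.get_inputs()` on your own model. The helper name `build_layoutlm_ort_inputs`, the token ids and the box values are made up for illustration.

```python
import numpy as np


def build_layoutlm_ort_inputs(input_ids, bbox, attention_mask=None, token_type_ids=None):
    """Build a feed dict of int64 NumPy arrays for an ONNX Runtime session."""
    input_ids = np.asarray(input_ids, dtype=np.int64)
    if input_ids.ndim == 1:          # single example: add the batch dimension
        input_ids = input_ids[None, :]
    batch, seq = input_ids.shape

    bbox = np.asarray(bbox, dtype=np.int64)
    if bbox.ndim == 2:               # (seq, 4) -> (1, seq, 4)
        bbox = bbox[None, :, :]
    if bbox.shape != (batch, seq, 4):
        raise ValueError(f"bbox must have shape {(batch, seq, 4)}, got {bbox.shape}")

    if attention_mask is None:       # assume no padding
        attention_mask = np.ones((batch, seq), dtype=np.int64)
    else:
        attention_mask = np.asarray(attention_mask, dtype=np.int64).reshape(batch, seq)

    if token_type_ids is None:       # assume a single segment
        token_type_ids = np.zeros((batch, seq), dtype=np.int64)
    else:
        token_type_ids = np.asarray(token_type_ids, dtype=np.int64).reshape(batch, seq)

    return {
        "input_ids": input_ids,
        "bbox": bbox,
        "attention_mask": attention_mask,
        "token_type_ids": token_type_ids,
    }


# Illustrative values: 4 tokens, one (x0, y0, x1, y1) box per token.
feed = build_layoutlm_ort_inputs(
    input_ids=[101, 7592, 2088, 102],
    bbox=[[0, 0, 0, 0], [10, 20, 110, 40], [120, 20, 220, 40], [1000, 1000, 1000, 1000]],
)

# With onnxruntime installed and a model exported to a hypothetical "model.onnx":
#   import onnxruntime
#   session = onnxruntime.InferenceSession("model.onnx")
#   outputs = session.run(None, feed)
```

If the collator returns PyTorch tensors, converting each one with `.numpy()` before building the feed should work the same way. The key point is that the box tensor keeps its trailing dimension of 4 rather than being flattened to match `input_ids`.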

This was referenced Aug 27, 2021

Layoutlm onnx support #13305 (Closed)

Layoutlm onnx support (Issue #13300) #13349 (Closed)

nishprabhu mentioned this issue Sep 14, 2021

Layoutlm onnx support (Issue #13300) #13562 (Merged)

nishprabhu mentioned this issue Sep 23, 2021

Fix LayoutLM ONNX test error #13710 (Merged)

github-actions bot closed this as completed Oct 4, 2021
