Layoutlm onnx support (Issue #13300) #13562

nishprabhu · 2021-09-14T12:39:29Z

What does this PR do?

This PR extends ONNX support to LayoutLM as explained in https://huggingface.co/transformers/serialization.html?highlight=onnx#converting-an-onnx-model-using-the-transformers-onnx-package

Fixes Issue #13300

Before submitting

* This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
* Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NielsRogge @mfuntowicz @LysandreJik

michaelbenayoun · 2021-09-15T12:42:22Z

Looks almost ready, great work @nishprabhu!!
Thank you for your contribution.

LysandreJik

This looks good to me. Do you want to have a look, @NielsRogge @mfuntowicz?

nishprabhu · 2021-09-20T15:02:48Z

Hi, is there any update on this? Are any other changes required?
@LysandreJik @mfuntowicz @NielsRogge

NielsRogge · 2021-09-20T16:17:58Z

LGTM!

LysandreJik · 2021-09-22T23:16:11Z

This is making the LayoutLM ONNX test fail for the following reason:

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <transformers.models.layoutlm.configuration_layoutlm.LayoutLMOnnxConfig object at 0x7f344811ceb0>
tokenizer = PreTrainedTokenizerFast(name_or_path='microsoft/layoutlm-base-uncased', vocab_size=30522, model_max_len=512, is_fast=T...okens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'})
batch_size = -1, seq_length = -1, is_pair = False
framework = <TensorType.PYTORCH: 'pt'>

    def generate_dummy_inputs(
        self,
        tokenizer: PreTrainedTokenizer,
        batch_size: int = -1,
        seq_length: int = -1,
        is_pair: bool = False,
        framework: Optional[TensorType] = None,
    ) -> Mapping[str, Any]:
        """
        Generate inputs to provide to the ONNX exporter for the specific framework
    
        Args:
            tokenizer: The tokenizer associated with this model configuration
            batch_size: The batch size (int) to export the model for (-1 means dynamic axis)
            seq_length: The sequence length (int) to export the model for (-1 means dynamic axis)
            is_pair: Indicate if the input is a pair (sentence 1, sentence 2)
            framework: The framework (optional) the tokenizer will generate tensor for
    
        Returns:
            Mapping[str, Tensor] holding the kwargs to provide to the model's forward function
        """
    
        input_dict = super().generate_dummy_inputs(tokenizer, batch_size, seq_length, is_pair, framework)
    
        # Generate a dummy bbox
        box = [48, 84, 73, 128]
    
        if not framework == TensorType.PYTORCH:
            raise NotImplementedError("Exporting LayoutLM to ONNX is currently only supported for PyTorch.")
    
        if not is_torch_available():
            raise ValueError("Cannot generate dummy inputs without PyTorch installed.")
        import torch
    
>       input_dict["bbox"] = torch.tensor(
            [
                [0] * 4,
                *[box] * seq_length,
                [self.max_2d_positions] * 4,
            ]
        ).tile(batch_size, 1, 1)
E       RuntimeError: Trying to create tensor with negative dimension -1: [-1, 2, 4]

Could you take a look @nishprabhu @michaelbenayoun?

Will skip this test in the meantime.
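
The shape in the error message follows from how the -1 placeholders are used. Multiplying a Python list by a negative integer yields an empty list, so only the two sentinel rows survive, and tile then receives -1 as the batch repeat count, giving [-1, 2, 4]. A minimal pure-Python sketch of the row arithmetic, with no PyTorch required; the helper name `bbox_rows` and the value 1024 are assumptions for illustration and stand in for the method body and self.max_2d_positions:

```python
# Illustrative only: reproduces the row count seen in the traceback.
MAX_2D_POSITIONS = 1024  # assumed placeholder for self.max_2d_positions


def bbox_rows(seq_length: int) -> list:
    """Build the per-sequence bbox rows the same way the failing code does."""
    box = [48, 84, 73, 128]
    return [
        [0] * 4,
        *[box] * seq_length,  # a negative multiplier yields an empty list
        [MAX_2D_POSITIONS] * 4,
    ]


rows_dynamic = bbox_rows(-1)  # -1 is the "dynamic axis" placeholder
rows_fixed = bbox_rows(8)     # a concrete sequence length

# With -1, only the two sentinel rows remain (2 rows of 4 values each).
print(len(rows_dynamic), len(rows_dynamic[0]))
# With 8, there are 8 boxes plus the two sentinel rows.
print(len(rows_fixed))
```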

nishprabhu · 2021-09-23T05:35:07Z

I think we have to import the compute_effective_axis_dimension function from the onnx module and use it to compute the batch_size and seq_length dimensions in configuration_layoutlm.py. We currently use -1 for both dimensions, which is what causes the error. @michaelbenayoun @LysandreJik
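
A minimal sketch of that idea, written in plain Python so that it runs without transformers or PyTorch installed. The local `effective_axis_dimension` helper is a stand-in that assumes compute_effective_axis_dimension replaces a non-positive dimension with a fixed default. The defaults of 2 and 8, the value 1024, and the `dummy_bbox` helper are likewise assumptions for illustration, not necessarily the change that was eventually merged.

```python
from typing import List

FIXED_BATCH = 2          # assumed fallback batch size for a dynamic axis
FIXED_SEQUENCE = 8       # assumed fallback sequence length for a dynamic axis
MAX_2D_POSITIONS = 1024  # assumed placeholder for self.max_2d_positions


def effective_axis_dimension(dimension: int, fixed_dimension: int, num_token_to_add: int = 0) -> int:
    """Stand-in helper: swap a dynamic (-1) axis for a concrete size."""
    if dimension <= 0:
        dimension = fixed_dimension
    return dimension - num_token_to_add


def dummy_bbox(batch_size: int = -1, seq_length: int = -1) -> List[List[List[int]]]:
    """Build a nested-list bbox of shape (batch, seq_length + 2, 4)."""
    # Resolve the dynamic axes before they are used as sizes.
    batch_size = effective_axis_dimension(batch_size, FIXED_BATCH)
    seq_length = effective_axis_dimension(seq_length, FIXED_SEQUENCE)

    box = [48, 84, 73, 128]
    one_sequence = [[0] * 4, *[box] * seq_length, [MAX_2D_POSITIONS] * 4]
    # Copy the rows for each batch entry so entries do not share list objects.
    return [[list(row) for row in one_sequence] for _ in range(batch_size)]


bbox = dummy_bbox()  # both axes left dynamic
shape = (len(bbox), len(bbox[0]), len(bbox[0][0]))
print(shape)
```

In the real config the bbox tensor would also have to match the sequence length of the tokenized input_ids, which this sketch does not check.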

* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Removed regression/ folder
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Fixed import error
* Remove unnecessary import statements
* Changed max_2d_positions from class variable to instance variable of the config class
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Add support for exporting PyTorch LayoutLM to ONNX
* cleanup
* Fixed import error
* Changed max_2d_positions from class variable to instance variable of the config class
* Use super class generate_dummy_inputs method
  Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Add support for Masked LM, sequence classification and token classification
  Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Removed uncessary import and method
* Fixed code styling
* Raise error if PyTorch is not installed
* Remove unnecessary import statement
  Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

nishprabhu and others added 24 commits August 27, 2021 17:36

Add support for exporting PyTorch LayoutLM to ONNX

a14d851

Added tests for converting LayoutLM to ONNX

21a845f

Merge branch 'huggingface:master' into layoutlm-onnx-support

68bb7c1

Add support for exporting PyTorch LayoutLM to ONNX

402d542

Added tests for converting LayoutLM to ONNX

f835ae7

cleanup

7a4f00d

Merge branch 'layoutlm-onnx-support' of https://github.com/nishprabhu/transformers into layoutlm-onnx-support

0424055

Cleanup

Removed regression/ folder

9662261

Add support for exporting PyTorch LayoutLM to ONNX

056d085

Added tests for converting LayoutLM to ONNX

aedb3d1

cleanup

6153781

Fixed import error

cb28af7

Fixed merge conflicts in configuration_layoutlm.py

f815649

Remove unnecessary import statements

f6a78cf

Changed max_2d_positions from class variable to instance variable of the config class

c99c6f3

Add support for exporting PyTorch LayoutLM to ONNX

4214721

Added tests for converting LayoutLM to ONNX

e0a1d86

cleanup

b94f383

Add support for exporting PyTorch LayoutLM to ONNX

36b6024

cleanup

9162f5d

Fixed import error

9fe974b

Changed max_2d_positions from class variable to instance variable of the config class

b5010a9

Merge branch 'layoutlm-onnx-support' of https://github.com/nishprabhu/transformers into layoutlm-onnx-support

ebf4f4a

Fetch and rebase from upstream

Merge branch 'huggingface:master' into layoutlm-onnx-support

5993a2d

mfuntowicz requested review from michaelbenayoun, mfuntowicz and LysandreJik September 14, 2021 13:24

michaelbenayoun reviewed Sep 14, 2021

src/transformers/onnx/features.py (Outdated)

michaelbenayoun reviewed Sep 14, 2021

src/transformers/models/layoutlm/configuration_layoutlm.py

michaelbenayoun reviewed Sep 14, 2021

src/transformers/models/layoutlm/configuration_layoutlm.py (Outdated)

michaelbenayoun reviewed Sep 14, 2021

src/transformers/models/layoutlm/configuration_layoutlm.py (Outdated)

nishprabhu and others added 5 commits September 15, 2021 13:21

Use super class generate_dummy_inputs method

6e37eb8

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

Add support for Masked LM, sequence classification and token classification

e3f33ee

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

Removed uncessary import and method

84277c9

Fixed code styling

cdcf3b7

Raise error if PyTorch is not installed

c569d33

michaelbenayoun reviewed Sep 15, 2021

src/transformers/onnx/__init__.py (Outdated)

Remove unnecessary import statement

381a040

michaelbenayoun approved these changes Sep 15, 2021

LysandreJik approved these changes Sep 15, 2021

LysandreJik merged commit ddd4d02 into huggingface:master Sep 21, 2021

nishprabhu deleted the layoutlm-onnx-support branch September 22, 2021 03:54

LysandreJik mentioned this pull request Sep 22, 2021

Skip ONNX LayoutLM test #13702

Closed

nishprabhu restored the layoutlm-onnx-support branch September 23, 2021 05:15

nishprabhu deleted the layoutlm-onnx-support branch September 23, 2021 05:20

nishprabhu mentioned this pull request Sep 23, 2021

Fix LayoutLM ONNX test error #13710

Merged
