Added support for other features for already supported models #14358

Conversation


@michaelbenayoun michaelbenayoun commented Nov 10, 2021

What does this PR do?

This PR adds support for almost all of the available features for models that the ONNX export already supports.

Main contributions:

  • OnnxSeq2SeqConfigWithPast: a new class inheriting from OnnxConfigWithPast, designed specifically for seq2seq models; this should make it easier for the community to contribute support for new models.
  • Tests refactoring and parameterization: every (model, feature) export pair is now tested as a standalone test (previously everything ran as one big test).
  • A lot of new features (a feature is a task plus the choice of whether to use past_key_values) that have been requested by the community (see the list of supported features below, followed by a short export sketch).

Features now supported:

  • For BERT-like models: default, sequence-classification, token-classification and question-answering (multiple-choice will be added later).
  • For causal language models (GPT-2 and GPT-Neo): default, default-with-past, causal-lm, causal-lm-with-past, sequence-classification and token-classification (GPT-2 only).
  • For Seq2Seq models (T5, BART, mBART):
    • T5, BART, mBART: default, default-with-past, seq2seq-lm, seq2seq-lm-with-past
    • BART, mBART: causal-lm, causal-lm-with-past, sequence-classification, question-answering
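
As a rough illustration (not part of this PR's diff), exporting one of these (model, feature) pairs through the Python API could look something like the snippet below; the checkpoint, feature and output path are placeholders, and it assumes the FeaturesManager and export helpers from transformers.onnx:

from pathlib import Path
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers.onnx import export
from transformers.onnx.features import FeaturesManager

checkpoint = "bert-base-uncased"  # placeholder checkpoint
feature = "sequence-classification"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

# Look up the ONNX config registered for this (model, feature) pair and run the export.
model_kind, onnx_config_cls = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
onnx_config = onnx_config_cls(model.config)
onnx_inputs, onnx_outputs = export(tokenizer, model, onnx_config, onnx_config.default_onnx_opset, Path("model.onnx"))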

@michaelbenayoun michaelbenayoun marked this pull request as ready for review December 3, 2021 14:47
Member

@lewtun lewtun left a comment

These changes are looking really good - the code is much more elegant and modular than before 🤩 !

Before we merge, I think it would be good to do a few sanity checks with "real inputs" for the seq2seq models. For example, just checking that we get agreement with these examples from the docs would be nice:


@property
def atol_for_validation(self) -> float:
return 1e-2
Member

Cool idea to allow different atol values per model!

Does the tolerance of 1e-2 for BART reflect the work in progress on this model? (Naively, I would have expected 1e-3 or smaller)

Member Author

I think this is because of the work in progress; I will try smaller values before merging.

return ordered_inputs

@property
def default_onnx_opset(self) -> int:
Member

Nice idea to define default operator sets this way :)

EXTERNAL_DATA_FORMAT_SIZE_LIMIT,
OnnxConfig,
OnnxConfigWithPast,
OnnxSeq2SeqConfigWithPast,
Member

From the end-user perspective, should one use OnnxSeq2SeqConfigWithPast for all seq2seq models? If so, we might want to explain this when we extend the documentation.

Member Author

Yes definitely!
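
As a rough documentation-style sketch (hypothetical model), describing a seq2seq model with it could look like this; it assumes the fill_with_past_key_values_ helper inherited from OnnxConfigWithPast:

from collections import OrderedDict
from typing import Mapping
from transformers.onnx import OnnxSeq2SeqConfigWithPast

class MySeq2SeqOnnxConfig(OnnxSeq2SeqConfigWithPast):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        common_inputs = OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "encoder_sequence"}),
                ("attention_mask", {0: "batch", 1: "encoder_sequence"}),
                ("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
                ("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
            ]
        )
        if self.use_past:
            # Register the past_key_values inputs and their dynamic axes.
            self.fill_with_past_key_values_(common_inputs, direction="inputs")
        return common_inputs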

decoder_shape = (
batch,
num_decoder_attention_heads,
1,
Member

Maybe it's a good idea to add a small comment explaining that we set the decoder sequence length to 1 because only the last decoder_input_ids is used when pre-computed past_key_values are provided?

(It's probably obvious to people deeply familiar with the transformers codebase, but might not be obvious to people trying to export their own models)

Member Author

You are right, I will add them.
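
For illustration, the shape in question (hypothetical values, not the final code) looks like this:

batch = 2
num_decoder_attention_heads = 12
hidden_size = 768

# When use_past is True, only the last decoder_input_ids token is fed to the model,
# so the decoder sequence length of the new step is 1; all earlier positions live in past_key_values.
decoder_shape = (
    batch,
    num_decoder_attention_heads,
    1,
    hidden_size // num_decoder_attention_heads,
)
print(decoder_shape)  # (2, 12, 1, 64)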

remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"

for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
Member

Perhaps we should add a comment here to explain why past_key_values involves tuples of 4 tensors?

Member Author

You are right!
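
For illustration, the structure being commented on (hypothetical dimensions) is roughly:

import torch

batch, num_heads, past_decoder_len, encoder_len, head_dim = 2, 12, 8, 16, 64
decoder_shape = (batch, num_heads, past_decoder_len, head_dim)
encoder_shape = (batch, num_heads, encoder_len, head_dim)

# Each layer's past_key_values entry holds 4 tensors for seq2seq models:
past_key_values_for_one_layer = (
    torch.zeros(decoder_shape),  # decoder self-attention key
    torch.zeros(decoder_shape),  # decoder self-attention value
    torch.zeros(encoder_shape),  # cross-attention key (over encoder hidden states)
    torch.zeros(encoder_shape),  # cross-attention value (over encoder hidden states)
)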

("LayoutLM", "microsoft/layoutlm-base-uncased", LayoutLMModel, LayoutLMConfig, LayoutLMOnnxConfig),
("MBart", "sshleifer/tiny-mbart", MBartModel, MBartConfig, MBartOnnxConfig),
# ("T5", "t5-small", T5Model, T5Config, T5OnnxConfig),
PYTORCH_EXPORT_MODELS = {
Member

This is much more elegant!

config.pad_token_id = tokenizer.eos_token_id

model_class = FeaturesManager.get_model_class_for_feature(feature)
model = model_class.from_config(config)
Member

As discussed offline, would it make more sense to load the model from a pretrained checkpoint instead of using a random initialization from a config?

I think using pretrained weights would be a more realistic test of how the ONNX export is used in real applications.

But maybe we can leave this as a TODO for a follow-up PR since this one is getting pretty large :)
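
For concreteness, the alternative could look roughly like this (hypothetical feature and checkpoint; the point is to load real pretrained weights instead of a randomly initialised model built from the config):

from transformers.onnx.features import FeaturesManager

feature = "sequence-classification"
model_class = FeaturesManager.get_model_class_for_feature(feature)
model = model_class.from_pretrained("bert-base-uncased")  # instead of model_class.from_config(config)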

Member

@lewtun lewtun left a comment

Thank you for this big refactoring and adding proper support for BART / mBART 🚀 !

I've manually tested T5, BART, and mBART with various outputs and the max absolute difference is between 1e-5 and 1e-4, which is perfectly fine IMO :)

Great work - I'm looking forward to building on top of this!
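
For reference, that kind of check can be sketched roughly as follows (an illustration rather than the exact script; it assumes onnxruntime is installed and that model.onnx was exported with the seq2seq-lm feature, so the input and output names below match):

import numpy as np
import onnxruntime as ort
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "sshleifer/tiny-mbart"  # tiny checkpoint also used in the tests
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

inputs = tokenizer("UN Chief Says There Is No Plan to Stop Chemical Weapons in Syria", return_tensors="pt")
decoder_input_ids = inputs["input_ids"]  # any valid token ids work for a numerical comparison

with torch.no_grad():
    ref_logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits.numpy()

session = ort.InferenceSession("model.onnx")
onnx_logits = session.run(
    ["logits"],
    {
        "input_ids": inputs["input_ids"].numpy(),
        "attention_mask": inputs["attention_mask"].numpy(),
        "decoder_input_ids": decoder_input_ids.numpy(),
        "decoder_attention_mask": inputs["attention_mask"].numpy(),
    },
)[0]
print("max abs diff:", np.abs(ref_logits - onnx_logits).max())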


@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "default-with-past", "seq2seq-lm", "seq2seq-lm-with-past"]:
Member

Perhaps I'm misunderstanding something, but why are default-with-past and seq2seq-lm-with-past included in this list?

Looking at the if/else logic, it seems we extract the common_inputs for past key values in the else clause, so I wonder if we should be using:

Suggested change (replace the first line with the second):
if self.task in ["default", "default-with-past", "seq2seq-lm", "seq2seq-lm-with-past"]:
if self.task in ["default", "seq2seq-lm"]:

Member Author

You're right; actually the task is never "with-past", as it is parsed before being passed to the OnnxConfig.
So those entries should definitely be deleted.

import torch
batch = common_inputs["input_ids"].shape[0]
encoder_seq_length = common_inputs["input_ids"].shape[1]
# decoder_seq_length = ordered_inputs["decoder_input_ids"].shape[1]
Member

Perhaps we can delete this bit of dead code?

return common_inputs

def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "default-with-past", "seq2seq-lm", "seq2seq-lm-with-past"]:
Member

Do we need to flatten past_key_values if the task is default or seq2seq-lm?

Member Author

Yes, because you can have the task default with self.use_past = True (when the feature is default-with-past).
As mentioned above, "default-with-past" is parsed into something like OnnxConfig(task=feature.replace("-with-past", ""), use_past="with-past" in feature).
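
In other words, roughly (an illustrative helper, not the library's actual code):

def feature_to_task_and_use_past(feature: str):
    task = feature.replace("-with-past", "")
    use_past = "with-past" in feature
    return task, use_past

print(feature_to_task_and_use_past("default-with-past"))  # ('default', True)
print(feature_to_task_and_use_past("seq2seq-lm"))         # ('seq2seq-lm', False)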

Member

Perfect, thank you for the clarification!

"default",
"masked-lm",
"sequence-classification",
# "multiple-choice",
Member

Should we delete this bit of dead code? (And similarly for the other occurrences of multiple-choice?)

Member Author

I'm not sure, because the plan is to add support for this too one day.
I do not know how much work that represents.

Member

OK let's keep it then :)

# ("T5", T5Config)
}
SUPPORTED_WITH_PAST_CONFIGS = {}
# SUPPORTED_WITH_PAST_CONFIGS = {
Member

Should these commented out configs now be included or are the unit tests not ready for these architectures yet?


# Generate decoder inputs
decoder_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer, batch_size, 1, is_pair, framework
Member

Following up on our discussion offline, should the sequence length be fixed at 1 in this base class? Would it be more appropriate to use seq_length (or some multiple thereof to test differing encoder / decoder sequence lengths)?
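
For example, dummy encoder and decoder inputs with different sequence lengths could be sketched like this (an illustration of the idea, not the proposed implementation, using the tiny checkpoint from the tests and hypothetical lengths):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-mbart")
batch_size, encoder_seq_length, decoder_seq_length = 2, 8, 4

encoder_text = [" ".join(["hello"] * encoder_seq_length)] * batch_size
decoder_text = [" ".join(["hello"] * decoder_seq_length)] * batch_size

encoder_tokens = tokenizer(encoder_text, return_tensors="pt")
decoder_tokens = tokenizer(decoder_text, return_tensors="pt")

dummy_inputs = {
    "input_ids": encoder_tokens["input_ids"],
    "attention_mask": encoder_tokens["attention_mask"],
    "decoder_input_ids": decoder_tokens["input_ids"],
    "decoder_attention_mask": decoder_tokens["attention_mask"],
}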

@lewtun lewtun mentioned this pull request Dec 8, 2021
@lewtun lewtun merged commit 0c70f14 into huggingface:master Dec 8, 2021
lewtun added a commit that referenced this pull request Dec 8, 2021
sgugger pushed a commit that referenced this pull request Dec 8, 2021
michaelbenayoun added a commit to michaelbenayoun/transformers that referenced this pull request Dec 9, 2021
michaelbenayoun added a commit to michaelbenayoun/transformers that referenced this pull request Dec 21, 2021
michaelbenayoun added a commit to michaelbenayoun/transformers that referenced this pull request Dec 22, 2021
michaelbenayoun added a commit that referenced this pull request Dec 22, 2021
* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"

This reverts commit 0f4e39c.

* is_torch_available test to avoid failing imports

* sorting parameterize parameters to solve ERROR gw0 gw1

* tests fix

* tests fix

* GPT2 with past fix

* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially

* Removed onnx file

* Implemented suggestions

* Fixed __init__ to resolve conflict with master

* Remove commented import
lewtun added a commit that referenced this pull request Dec 23, 2021
* First commit to add MarianMT to ONNX

* Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward()

* Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature

* Style fix

* Added support for other features for already supported models

* Partial support for causal and seq2seq models

* Partial support for causal and seq2seq models

* Add default task for MarianMT ONNX

* Remove automatic creation of decoder_input_ids

* Extend inputs and outputs for MarianMT ONNX config

* Add MarianMT to ONNX unit tests

* Refactor

* OnnxSeq2SeqConfigWithPast to support seq2seq models

* Parameterized the onnx tests

* Restored run_mlm.py

* Restored run_mlm.py

* [WIP] BART update

* BART and MBART

* Add past_key_values and fix dummy decoder inputs

Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations.

* Refactor MarianOnnxConfig to remove custom past_key_values logic

* Fix quality

* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"

This reverts commit 0f4e39c.

* is_torch_available test to avoid failing imports

* sorting parameterize parameters to solve ERROR gw0 gw1

* tests fix

* tests fix

* GPT2 with past fix

* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially

* Removed onnx file

* Refactor Marian export to account for base changes

* Fix copies

* Implemented suggestions

* Extend support for causal LM

* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"

This reverts commit 0f4e39c.

* is_torch_available test to avoid failing imports

* sorting parameterize parameters to solve ERROR gw0 gw1

* tests fix

* tests fix

* GPT2 with past fix

* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially

* Removed onnx file

* Implemented suggestions

* Fixed __init__ to resolve conflict with master

* Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"

This reverts commit 0f4e39c.

* is_torch_available test to avoid failing imports

* sorting parameterize parameters to solve ERROR gw0 gw1

* tests fix

* tests fix

* GPT2 with past fix

* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially

* Removed onnx file

* Implemented suggestions

* Fixed __init__ to resolve conflict with master

* Remove commented import

* Remove ONNX model

* Remove redundant class method

* Tidy up imports

* Fix quality

* Refactor dummy input function

* Add copied from statements to Marian config functions

* Remove false copied from comments

* Fix copy from comment

Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
…gface#14358)

* Added support for other features for already supported models

* Partial support for causal and seq2seq models

* Partial support for causal and seq2seq models

* OnnxSeq2SeqConfigWithPast to support seq2seq models

* Parameterized the onnx tests

* Restored run_mlm.py

* Restored run_mlm.py

* [WIP] BART update

* BART and MBART

* Added comments

* Another sequence length of the past_key_values
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Revert "Revert "Added support for other features for already supported models (huggingface#14358)" (huggingface#14679)"

This reverts commit 0f4e39c.

* is_torch_available test to avoid failing imports

* sorting parameterize parameters to solve ERROR gw0 gw1

* tests fix

* tests fix

* GPT2 with past fix

* Fixed stateful class attribute change that was breaking things when converting multiple models sequentially

* Removed onnx file

* Implemented suggestions

* Fixed __init__ to resolve conflict with master

* Remove commented import
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
@girishnadiger-gep

@michaelbenayoun @lewtun @Albertobegue
Any idea when this PR will be merged?

@lewtun
Member

lewtun commented Mar 1, 2022

@michaelbenayoun @lewtun @Albertobegue any idea when this PR will be merged?

Hey @girishnadiger-gep, this PR was superseded by #14700, which was merged some time ago. Is there a specific issue or missing feature that you're interested in?

@girishnadiger-gep

girishnadiger-gep commented Mar 1, 2022

Hi @lewtun,
Thanks for getting back. I was trying to implement BART summarization with ONNX. I'm facing a strange issue where the exported model expects 4 inputs ('input_ids', 'attention_mask', 'decoder_input_ids', 'decoder_attention_mask'), but I'm unable to figure out how to get the 'decoder_input_ids' and 'decoder_attention_mask' values to feed to the ONNX model.

I've converted a BART-large model to ONNX using the 'seq2seq-lm' feature, but I thought I was missing something here, so I asked in this forum.

@lewtun
Member

lewtun commented Mar 1, 2022

Ah for that you can probably adapt the example that I used for the Marian PR in #14586
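
If it helps in the meantime, a minimal greedy-decoding loop against the exported model could look roughly like this (not the exact example from that PR; the checkpoint, ONNX file name and input/output names are assumptions that depend on how the model was exported with the seq2seq-lm feature):

import numpy as np
import onnxruntime as ort
from transformers import AutoConfig, AutoTokenizer

checkpoint = "facebook/bart-large-cnn"  # hypothetical summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
config = AutoConfig.from_pretrained(checkpoint)
session = ort.InferenceSession("bart-large-cnn.onnx")

encoder_inputs = tokenizer("Article text to summarize ...", return_tensors="np")
# Decoding starts from decoder_start_token_id; each step appends the predicted token.
decoder_input_ids = np.array([[config.decoder_start_token_id]], dtype=np.int64)

for _ in range(60):  # greedy decoding with a fixed budget, for brevity
    logits = session.run(
        ["logits"],
        {
            "input_ids": encoder_inputs["input_ids"],
            "attention_mask": encoder_inputs["attention_mask"],
            "decoder_input_ids": decoder_input_ids,
            "decoder_attention_mask": np.ones_like(decoder_input_ids),
        },
    )[0]
    next_token = logits[:, -1].argmax(axis=-1)[:, None].astype(np.int64)
    decoder_input_ids = np.concatenate([decoder_input_ids, next_token], axis=-1)
    if next_token.item() == config.eos_token_id:
        break

print(tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True))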

FYI we also have a forum (https://discuss.huggingface.co/) which is better suited for this type of question - we try to use GitHub issues for bug reports / feature requests.
