
add mt5 to ORTConfigManager conf list #341

Merged
merged 5 commits into huggingface:main from add-mt5-to-ortconfigmanager on Oct 2, 2022

Conversation

@chainyo (Contributor) commented Aug 9, 2022

What does this PR do?

Add MT5 to ORTConfigManager.

Fixes #321

I re-arranged all available models in alphabetical order. I can put it back the way it was if needed.

@JingyaHuang

Aside from this PR, would it be worth opening an issue like huggingface/transformers#16308 to track implementing all available ONNX models in the ORTConfigManager?

@HuggingFaceDocBuilderDev commented Aug 9, 2022

The documentation is not available anymore as the PR was closed or merged.

@JingyaHuang (Collaborator)

Hi @chainyo, thank you for adding mt5 to transformers and to optimum!

Sorting the models in alphabetical order makes sense to me. Can you also add a test for mt5 in tests/onnxruntime/test_optimization.py with the model hf-internal-testing/tiny-random-mt5 (a sketch follows)? Thanks!
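For illustration, the new entry could look roughly like this (a sketch only; the dictionary name and the existing entries are assumptions, not the exact contents of the test file):

# Illustrative mapping of each tested architecture to a tiny test checkpoint.
SUPPORTED_ARCHITECTURES_WITH_MODEL_ID = {
    "bert": "hf-internal-testing/tiny-random-bert",
    # New entry requested above:
    "mt5": "hf-internal-testing/tiny-random-mt5",
}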

@chainyo (Contributor, Author) commented Aug 10, 2022

As above, I re-arranged all tested architectures in alphabetical order.

Because MT5 has no sequence-classification feature, I introduced a new argument, model_feature (sketched below). Let me know if that's acceptable; it could also help with other models not designed for sequence-classification in the future.

EDIT: tests are failing for MT5 because the CI does not install the main branch of transformers. Should we wait until MT5 support is included in a future release?
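A minimal sketch of the model_feature idea, assuming a parameterized test layout (the names here are illustrative, not the exact test code):

import unittest

from parameterized import parameterized

# Pair each architecture with the export feature to test, since some models
# (e.g. MT5) do not support sequence-classification.
ARCHITECTURES_WITH_FEATURE = [
    ("bert", "sequence-classification"),
    ("mt5", "seq2seq-lm"),
]

class ORTOptimizerTest(unittest.TestCase):
    @parameterized.expand(ARCHITECTURES_WITH_FEATURE)
    def test_optimize(self, model_arch, model_feature):
        # The optimizer would be instantiated with the per-model feature here,
        # e.g. ORTOptimizer(tokenizer, model, feature=model_feature).
        ...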

@JingyaHuang (Collaborator) left a review comment

The refactoring looks neat to me, pinging @regisss and @michaelbenayoun for their input.

@regisss (Contributor) left a review comment

LGTM, thanks for this @chainyo!!

@regisss (Contributor) commented Aug 12, 2022

The build PR documentation test should pass now.

@regisss (Contributor) commented Aug 12, 2022

As for opening an issue for implementing all available ONNX models in the ORTConfigManager, why not. It could be a good way to track which models still have to be added. What do you think @JingyaHuang @echarlaix @mfuntowicz @fxmarty ?

@JingyaHuang (Collaborator)

Besides, it seems that the CI of the optimization has one failed test:

2022-08-10T14:39:30.5123260Z =========================== short test summary info ============================
2022-08-10T14:39:30.5123701Z FAILED onnxruntime/test_optimization.py::ORTOptimizerTest::test_optimize_6 - ...

I can't see details about where it failed. Can you run the test locally to get the complete log of the failure? Thank you!

@JingyaHuang (Collaborator)

> As for opening an issue for implementing all available ONNX models in the ORTConfigManager, why not. It could be a good way to track which models still have to be added. What do you think @JingyaHuang @echarlaix @mfuntowicz @fxmarty ?

IMO, it makes sense to keep them all in the tests as long as they don't slow down the CI. When it becomes too slow, we can move some of them to slow tests (see the sketch below).
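A sketch of that gating pattern, using the slow decorator from transformers.testing_utils (the test name and body are illustrative):

import unittest

from transformers.testing_utils import slow

class ORTOptimizerSlowTest(unittest.TestCase):
    @slow  # skipped unless RUN_SLOW=1, so the default CI stays fast
    def test_optimize_remaining_architectures(self):
        ...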

@regisss (Contributor) commented Aug 12, 2022

> As for opening an issue for implementing all available ONNX models in the ORTConfigManager, why not. It could be a good way to track which models still have to be added. What do you think @JingyaHuang @echarlaix @mfuntowicz @fxmarty ?
>
> IMO, it makes sense to keep them all in the tests as long as they don't slow down the CI. When it becomes too slow, we can move some of them to slow tests.

I was actually talking about the suggestion of opening an issue such as this one to keep track of the models to add to ORTConfigManager.

@JingyaHuang (Collaborator) commented Aug 12, 2022

@regisss oh sorry, I misunderstood it. I think that is a good idea! Both users and contributors could have a better idea about what has been integrated, what needs to be integrated, and probably some references on how to do that.

@regisss (Contributor) commented Aug 12, 2022

> @regisss oh sorry, I misunderstood it. I think that is a good idea! Both users and contributors could have a better idea about what has been integrated, what needs to be integrated, and probably some references on how to do that.

Yes exactly :)

@chainyo Feel free to create such an issue 👍

@chainyo (Contributor, Author) commented Aug 16, 2022

> @regisss oh sorry, I misunderstood it. I think that is a good idea! Both users and contributors could have a better idea about what has been integrated, what needs to be integrated, and probably some references on how to do that.
>
> Yes exactly :)
>
> @chainyo Feel free to create such an issue 👍

Cool, I will open it right now!

> Besides, it seems that the CI of the optimization has one failed test:
>
> 2022-08-10T14:39:30.5123260Z =========================== short test summary info ============================
> 2022-08-10T14:39:30.5123701Z FAILED onnxruntime/test_optimization.py::ORTOptimizerTest::test_optimize_6 - ...
>
> I can't see details about where it failed. Can you run the test locally to get the complete log of the failure? Thank you!

I will take a look at the failure message.

@chainyo (Contributor, Author) commented Aug 16, 2022

> Besides, it seems that the CI of the optimization has one failed test:
>
> 2022-08-10T14:39:30.5123260Z =========================== short test summary info ============================
> 2022-08-10T14:39:30.5123701Z FAILED onnxruntime/test_optimization.py::ORTOptimizerTest::test_optimize_6 - ...
>
> I can't see details about where it failed. Can you run the test locally to get the complete log of the failure? Thank you!

When you scroll up, you can access the full traceback. It's still due to MT5 not being implemented in Transformers. Can you re-launch the CI tests, or should I push a commit to rerun them automatically? @JingyaHuang

=================================== FAILURES ===================================
_______________________ ORTOptimizerTest.test_optimize_6 _______________________
[gw1] linux -- Python 3.8.13 /opt/hostedtoolcache/Python/3.8.13/x64/bin/python

a = (<test_optimization.ORTOptimizerTest testMethod=test_optimize_6>,)

    @wraps(func)
    def standalone_func(*a):
>       return func(*(a + p.args), **p.kwargs)

/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/parameterized/parameterized.py:533: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
onnxruntime/test_optimization.py:79: in test_optimize
    optimizer = ORTOptimizer(tokenizer, model, feature=model_feature)
/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/optimum/onnxruntime/optimization.py:88: in __init__
    self._model_type, onnx_config_factory = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/transformers/onnx/features.py:584: in check_supported_model_or_raise
    model_features = FeaturesManager.get_supported_features_for_model_type(model_type, model_name=model_name)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

model_type = 'mt5', model_name = ''

    @staticmethod
    def get_supported_features_for_model_type(
        model_type: str, model_name: Optional[str] = None
    ) -> Dict[str, Callable[[PretrainedConfig], OnnxConfig]]:
        """
        Tries to retrieve the feature -> OnnxConfig constructor map from the model type.
    
        Args:
            model_type (`str`):
                The model type to retrieve the supported features for.
            model_name (`str`, *optional*):
                The name attribute of the model object, only used for the exception message.
    
        Returns:
            The dictionary mapping each feature to a corresponding OnnxConfig constructor.
        """
        model_type = model_type.lower()
        if model_type not in FeaturesManager._SUPPORTED_MODEL_TYPE:
            model_type_and_model_name = f"{model_type} ({model_name})" if model_name else model_type
>           raise KeyError(
                f"{model_type_and_model_name} is not supported yet. "
                f"Only {list(FeaturesManager._SUPPORTED_MODEL_TYPE.keys())} are supported. "
                f"If you want to support {model_type} please propose a PR or open up an issue."
            )
E           KeyError: "mt5 is not supported yet. Only ['albert', 'bart', 'beit', 'bert', 'big-bird', 'bigbird-pegasus', 'blenderbot', 'blenderbot-small', 'bloom', 'camembert', 'codegen', 'convbert', 'convnext', 'data2vec-text', 'deberta', 'deberta-v2', 'deit', 'detr', 'distilbert', 'electra', 'flaubert', 'gpt2', 'gptj', 'gpt-neo', 'ibert', 'layoutlm', 'layoutlmv3', 'levit', 'longt5', 'marian', 'mbart', 'mobilebert', 'mobilevit', 'm2m-100', 'perceiver', 'resnet', 'roberta', 'roformer', 'squeezebert', 't5', 'vit', 'xlm', 'xlm-roberta', 'yolos'] are supported. If you want to support mt5 please propose a PR or open up an issue."

/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/transformers/onnx/features.py:486: KeyError

@JingyaHuang (Collaborator)

Hi @chainyo,

Thanks for explaining, that's clear. Indeed, as the CI leverages the latest transformers, it won't pass until your contribution of MT5 ONNX config makes its way to a release of transformers.

@customer101

Hi @chainyo,
I really appreciate the work done to support mT5 in optimum; this saves me a lot of time.
I have faced some issues after running this code:

from transformers import T5Tokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model_path = "./mt5_model"
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = ORTModelForSeq2SeqLM.from_pretrained(model_path, from_transformers=True)

It fails in optimum/onnxruntime/modeling_seq2seq.py:

        # Export the decoder with the past key values
        if use_cache:
            export(
                preprocessor=tokenizer,
                model=decoder_with_lm_head,
                config=onnx_config_decoder_with_past,
                opset=onnx_opset,
                output=save_dir.joinpath(ONNX_DECODER_WITH_PAST_NAME),
            )

Error:

RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 85 but got size 64 for tensor number 1 in the list.

If I set use_cache=False, it works fine. Will setting it to False degrade the speed-up expected from ONNX optimizations?

@chainyo (Contributor, Author) commented Aug 25, 2022

> Hi @chainyo, I really appreciate the work done to support mT5 in optimum; this saves me a lot of time. I have faced some issues after running this code:
>
> model_path = "./mt5_model"
> tokenizer = T5Tokenizer.from_pretrained(model_path)
> model = ORTModelForSeq2SeqLM.from_pretrained(model_path, from_transformers=True)
>
> It fails in optimum/onnxruntime/modeling_seq2seq.py:
>
>         # Export the decoder with the past key values
>         if use_cache:
>             export(
>                 preprocessor=tokenizer,
>                 model=decoder_with_lm_head,
>                 config=onnx_config_decoder_with_past,
>                 opset=onnx_opset,
>                 output=save_dir.joinpath(ONNX_DECODER_WITH_PAST_NAME),
>             )
>
> Error:
>
> RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 85 but got size 64 for tensor number 1 in the list.
>
> If I set use_cache=False, it works fine. Will setting it to False degrade the speed-up expected from ONNX optimizations?

Hi @customer101, could you check your model's config.json file and tell me the architecture used? Did you convert the MT5 model to ONNX with the main branch of Transformers?

Are you using it in a pipeline or with vanilla PyTorch? Where do you add the use_cache=False argument?

@customer101

I used the transformers main branch, and the model is vanilla PyTorch.
This line:

model = ORTModelForSeq2SeqLM.from_pretrained(model_path, from_transformers=True, use_cache=False)

does the conversion; then when I save the model, it dumps two ONNX models (encoder and decoder).
Here is the config.json:

{
  "_name_or_path": "mt5_model",
  "architectures": [
    "MT5ForConditionalGeneration"
  ],
  "d_ff": 1024,
  "d_kv": 64,
  "d_model": 512,
  "decoder_start_token_id": 0,
  "dropout_rate": 0.1,
  "early_stopping": true,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "layer_norm_epsilon": 1e-06,
  "max_length": 256,
  "model_type": "mt5",
  "num_beams": 3,
  "num_decoder_layers": 8,
  "num_heads": 6,
  "num_layers": 8,
  "pad_token_id": 0,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "tokenizer_class": "T5Tokenizer",
  "torch_dtype": "float32",
  "transformers_version": "4.17.0",
  "use_cache": true,
  "vocab_size": 250112
}

I noticed that mt5 optimizations are not implemented in onnxruntime yet (see the check below), so I think I won't gain much speed-up by converting to ONNX.
Let me know if you need further information. Good luck @chainyo.
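For reference, the supported types can be checked directly against the mapping the optimizer uses (a quick sketch; MODEL_TYPES is the same dictionary that raises the KeyError quoted later in this thread):

# Inspect onnxruntime's fusion-optimizer model types.
from onnxruntime.transformers.optimizer import MODEL_TYPES

print(sorted(MODEL_TYPES.keys()))
print("mt5" in MODEL_TYPES)  # False at the time of this thread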

@irg1008 mentioned this pull request Sep 1, 2022
@chainyo (Contributor, Author) commented Sep 15, 2022

Hi @JingyaHuang @regisss,

Transformers v4.22.0 includes the MT5 OnnxConfig. Can we relaunch the tests and see if everything passes here?

@JingyaHuang (Collaborator)

Hi @chainyo,

Sure, before launching the tests again, can you do a rebase with the main branch of Optimum? Thank you!

@chainyo reopened this Sep 15, 2022
@chainyo (Contributor, Author) commented Sep 15, 2022

> Sure, before launching the tests again, can you do a rebase with the main branch of Optimum? Thank you!

It's done @JingyaHuang

Optimizer behavior has changed, so adding a feature argument is not needed anymore!

@chainyo (Contributor, Author) commented Sep 15, 2022

I fixed a typo in the tests.

I don't know if ORTModelForCustomTasks is the right one, but I didn't find any ORTModelForConditionalGeneration. Or should I import the MT5-specific one?

@JingyaHuang (Collaborator)

Hi @chainyo, the current ORTModelForCustomTasks doesn't support sequence-to-sequence models yet; it should be ORTModelForSeq2SeqLM in this case.

As @customer101 observed, there is currently a problem with the dimensions of past_key_values, so you need to set use_cache=False as a workaround for the time being.

Here is the snippet:

from optimum.onnxruntime import ORTModelForSeq2SeqLM
model = ORTModelForSeq2SeqLM.from_pretrained("google/mt5-small", from_transformers=True, use_cache=False)
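For completeness, generation with the exported model then goes through the usual generate API (a minimal sketch; the input text is arbitrary):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
inputs = tokenizer("translate English to German: hello", return_tensors="pt")
outputs = model.generate(**inputs)  # model is the ORTModelForSeq2SeqLM above
print(tokenizer.decode(outputs[0], skip_special_tokens=True))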

@JingyaHuang (Collaborator)

Hi @customer101,

Thanks for reporting the issue with use_past. It seems there is a problem when exporting decoder_with_past.

past_key_values contains pre-computed attention keys and values, which speeds up sequential decoding. Can you open a separate issue for the bug so that we can discuss and tackle it together?
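To illustrate what past_key_values buys us, here is a minimal sketch of two decoding steps with the vanilla PyTorch model (illustrative only, not the export code):

import torch
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

inputs = tokenizer("hello world", return_tensors="pt")
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])

# Step 1: full decoder pass; cache the attention keys/values.
out = model(**inputs, decoder_input_ids=decoder_input_ids, use_cache=True)
past = out.past_key_values

# Step 2: feed only the newly generated token; the cached keys/values avoid
# recomputing attention over the whole prefix.
next_token = out.logits[:, -1:].argmax(-1)
out = model(**inputs, decoder_input_ids=next_token, past_key_values=past, use_cache=True)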

@chainyo (Contributor, Author) commented Sep 15, 2022

> Hi @chainyo, the current ORTModelForCustomTasks doesn't support sequence-to-sequence models yet; it should be ORTModelForSeq2SeqLM in this case.
>
> As @customer101 observed, there is currently a problem with the dimensions of past_key_values, so you need to set use_cache=False as a workaround for the time being.
>
> Here is the snippet:
>
> from optimum.onnxruntime import ORTModelForSeq2SeqLM
> model = ORTModelForSeq2SeqLM.from_pretrained("google/mt5-small", from_transformers=True, use_cache=False)

Thanks, I fixed the code. I didn't see that it was separated into two different test functions. It should be good now.

@chainyo (Contributor, Author) commented Sep 19, 2022

Hi @JingyaHuang,

Looking at the CI failures, I see three problems:

  • First: about evaluate
ModuleNotFoundError: No module named 'evaluate'
  • Second: about mt5 KeyError
input = '/home/runner/.cache/huggingface/hub/hf-internal-testing/tiny-random-mt5/encoder_model.onnx'
model_type = 'mt5', num_heads = 6, hidden_size = 8
optimization_options = <onnxruntime.transformers.fusion_options.FusionOptions object at 0x7f70215d8f70>
opt_level = 2, use_gpu = False, only_onnxruntime = False

    def optimize_model(
        input: str,
        model_type: str = "bert",
        num_heads: int = 0,
        hidden_size: int = 0,
        optimization_options: Optional[FusionOptions] = None,
        opt_level: int = None,
        use_gpu: bool = False,
        only_onnxruntime: bool = False,
    ):
        """Optimize Model by OnnxRuntime and/or python fusion logic.
    
        ONNX Runtime has graph optimizations (https://onnxruntime.ai/docs/resources/graph-optimizations.html).
        However, the coverage is limited. We also have graph fusions that implemented in Python to improve the coverage.
        They can combined: ONNX Runtime will run first when opt_level > 0, then graph fusions in Python will be applied.
    
        To use ONNX Runtime only and no Python fusion logic, use only_onnxruntime flag and a positive opt_level like
            optimize_model(input, opt_level=1, use_gpu=False, only_onnxruntime=True)
    
        When opt_level is None, we will choose default optimization level according to model type.
    
        When opt_level is 0 and only_onnxruntime is False, only python fusion logic is used and onnxruntime is disabled.
    
        When opt_level > 1, use_gpu shall set properly since the optimized graph might contain operators for GPU or CPU only.
        If your model is intended for GPU inference only (especially float16 or mixed precision model), it is recommended to
        set use_gpu to be True, otherwise the model is not optimized for GPU inference.
    
        For BERT model, num_heads and hidden_size are optional. For other model types, you need specify these parameters.
    
        Args:
            input (str): input model path.
            model_type (str, optional): model type - like bert, bert_tf, bert_keras or gpt2. Defaults to 'bert'.
            num_heads (int, optional): number of attention heads. Defaults to 0.
                                       0 allows detect the parameter from graph automatically (for model_type "bert" only).
            hidden_size (int, optional): hidden size. Defaults to 0.
                                         0 allows detect the parameter from graph automatically (for model_type "bert" only).
            optimization_options (FusionOptions, optional): optimization options that turn on/off some fusions. Defaults to None.
            opt_level (int, optional): onnxruntime graph optimization level (0, 1, 2 or 99) or None. Defaults to None.
                                       When the value is None, default value (1 for bert and gpt2, 0 for other model types) will be used.
                                       When the level > 0, onnxruntime will be used to optimize model first.
            use_gpu (bool, optional): use gpu or not for onnxruntime. Defaults to False.
            only_onnxruntime (bool, optional): only use onnxruntime to optimize model, and no python fusion. Defaults to False.
    
         Returns:
            object of an optimizer class.
        """
        assert opt_level is None or opt_level in [0, 1, 2, 99]
    
        if model_type != "bert" and (num_heads == 0 or hidden_size == 0):
            logger.warning("Please specify parameters of num_heads and hidden_size when model_type is not 'bert'")
    
>       (optimizer_class, producer, default_opt_level) = MODEL_TYPES[model_type]
E       KeyError: 'mt5'

/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/onnxruntime/transformers/optimizer.py:216: KeyError

I don't know whether the mt5 KeyError is caused by my changes or not.

  • Third: about FakeConfig
ERROR: test_push_to_hub (test_configuration_utils.ConfigPushToHubTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/optimum/optimum/tests/test_configuration_utils.py", line 74, in test_push_to_hub
    config.save_pretrained(
  File "/home/runner/work/optimum/optimum/optimum/configuration_utils.py", line 77, in save_pretrained
    repo = self._create_or_get_repo(save_directory, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/transformers/configuration_utils.py", line 251, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'FakeConfig' object has no attribute '_create_or_get_repo'

@JingyaHuang self-assigned this Sep 20, 2022
@JingyaHuang (Collaborator) left a review comment

Hi @chainyo,

Thanks for following up on the PR.

To pass the CIs, here are several modifications to apply:

  • For the benchmark suite test, can you rebase your branch? The evaluate module has been added since Add evaluate to setup #384.
  • For the MT5 support, the hidden size attribute name should be d_model.
  • ONNX Runtime optimization supports bert-like, gpt-like and bart-like models. As t5/mt5 are encoder-decoder models, I suggest setting the model type to "bart" to benefit from some extra fusions.
  • And we recently reduced the size of some models to accelerate the CIs; can you use "hf-internal-testing/tiny-random-onnx-mt5" for the test?

With the modifications above, the PR should be good to go (a sketch of the resulting entry follows)!
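Concretely, the resulting ORTConfigManager entry might look like this (a sketch; the tuple layout of attention-heads attribute, hidden-size attribute, and ORT optimizer model type is assumed from the existing entries in optimum/onnxruntime/utils.py):

_conf = {
    # (num-attention-heads attribute, hidden-size attribute, ORT model type)
    "bert": ("num_attention_heads", "hidden_size", "bert"),
    "mt5": ("num_heads", "d_model", "bart"),  # d_model and "bart" per the review above
}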

Review comments on optimum/onnxruntime/utils.py and tests/onnxruntime/test_optimization.py (outdated, resolved).
@JingyaHuang (Collaborator) left a review comment

Thanks for iterating on the PR, LGTM!

@JingyaHuang merged commit 5f7ac4d into huggingface:main Oct 2, 2022
@chainyo deleted the add-mt5-to-ortconfigmanager branch December 7, 2022 12:26
Linked issue: MT5 Support (#321)