Fix GPT-J onnx conversion #16780
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks for the fix @chainyo! This change to sinusoid_inp looks good to me, but I'd like @patil-suraj to comment on whether he thinks it would have any negative consequences.
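For context, the change under discussion moves the sinusoidal position computation into floating point so the traced ONNX graph never feeds an int64 tensor into torch.sin / torch.cos. A minimal sketch of the idea, not the exact diff from modeling_gptj.py (the function body here is an approximation):

```python
import torch

def fixed_pos_embedding(dim: int, seq_len: int):
    # Inverse frequencies for the rotary embedding, computed in float32
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
    # The crux of the fix: position indices created as float, not int64,
    # so the exported ONNX graph type-checks
    positions = torch.arange(seq_len, dtype=torch.float)
    sinusoid_inp = torch.einsum("i,j->ij", positions, inv_freq)
    return torch.sin(sinusoid_inp), torch.cos(sinusoid_inp)
```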
Looks good, thanks for the fix!
Fine, I'm going to accept @lewtun's suggestion. 🤗 I see there are some failing checks, and I also see problems with linting. Is this normal?
The …
Last time I tried to run …
@chainyo But I don't think it's working well when I tested it with a model that was exported with use_cache / use_past enabled. Can you give me an example, or check if there's anything wrong with my code? Below is the test I did:

```python
import onnxruntime

ort_session = onnxruntime.InferenceSession(onnx_path, providers=['CUDAExecutionProvider'])

# Check the session's inputs
for ort_session_input in ort_session.get_inputs():
    print(ort_session_input.name, ort_session_input.shape, ort_session_input.type)
# input_ids ['batch', 'sequence'] tensor(int64)
# past_key_values.0.key ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# past_key_values.0.value ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# ...
# past_key_values.27.key ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# past_key_values.27.value ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# attention_mask ['batch', 'past_sequence + sequence'] tensor(float)

input_txt_list = [
    'text for test',
    'gptj',
]
# make_onnx_inputs is my own helper: it tokenizes the texts and builds empty
# past_key_values placeholders, returning a dict of torch tensors
ort_input = make_onnx_inputs(input_txt_list)
for k, v in ort_input.items():
    print(k, v.size(), v.dtype)
# input_ids torch.Size([2, 3]) torch.int64
# past_key_values.0.key torch.Size([2, 16, 0, 256]) torch.float32
# past_key_values.0.value torch.Size([2, 16, 0, 256]) torch.float32
# ...
# past_key_values.27.key torch.Size([2, 16, 0, 256]) torch.float32
# past_key_values.27.value torch.Size([2, 16, 0, 256]) torch.float32
# attention_mask torch.Size([2, 3]) torch.float32

# TypeError raised here (see the error message below)
ort_output = ort_session.run(None, ort_input)
```

And this is the error message:
Hi @lsn1106, could you try loading your model in netron.app and checking the expected inputs by clicking on the first layer? I'm not sure, but maybe …
Hey @lsn1106, looking at your error in ORT, it seems that you're not passing inputs with the correct types. What happens if you cast your inputs to NumPy arrays and ensure that …
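For what it's worth, here is a minimal sketch of that suggestion, reusing ort_session and ort_input from the snippet above; the explicit float32 cast for attention_mask is an assumption based on the tensor(float) input type printed earlier:

```python
import numpy as np

# InferenceSession.run expects NumPy arrays, not torch tensors
ort_input_np = {k: v.cpu().numpy() for k, v in ort_input.items()}
# The exported graph declares attention_mask as tensor(float)
ort_input_np['attention_mask'] = ort_input_np['attention_mask'].astype(np.float32)
ort_output = ort_session.run(None, ort_input_np)
```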
@lewtun …
Maybe you can rebase on …
@lsn1106 would you mind sharing a reproducible code snippet that shows how you export the model, how you create the inputs for ORT, etc.?
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Force-pushed from d61afbe to 8417da1
Well, it seems to be solved, thanks!
I think the last thing we need to do is run …
Yes, sorry, the first …
Great, thanks for fixing the style issues! Merging this, since the issue reported by @lsn1106 is unrelated to the fix provided by this PR.
* add gptj to TOKENIZER_MAPPING_NAMES
* fix int32 to float to avoid problem in onnx
* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
What does this PR do?
Fix some problems encountered while converting a GPT-J model to ONNX.
Thanks to @ri938, who found where to fix the bugs (on the 🤗 Discord).
Models:
gpt2: @patrickvonplaten, @LysandreJik
and @lewtun, because you reviewed the first PR for the GPT-J ONNX config, here: #16274
I'm currently uploading a fully converted EleutherAI/gpt-j-6B model to the Hub, which demonstrates that the conversion command line works with these fixes. Find it here. Here is the command I used (I had to set atol to 1e-04, because validation failed at 1e-05):
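For reference, a typical transformers.onnx export invocation from that era looks like the sketch below; the exact command was elided above, and the feature name and output directory here are assumptions:

```bash
python -m transformers.onnx --model=EleutherAI/gpt-j-6B --feature=causal-lm --atol=1e-4 onnx/
```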