Fix GPT-J onnx conversion #16780
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks for the fix @chainyo! This change to sinusoid_inp looks good to me, but I'd like @patil-suraj to comment on whether he thinks it would have any negative consequences.
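For context, the change under discussion moves the sinusoidal position computation into floating point so the traced ONNX graph never feeds an int64 tensor into torch.sin / torch.cos. A minimal sketch of the idea, not the exact diff from modeling_gptj.py (the function body here is an approximation):

```python
import torch

def fixed_pos_embedding(dim: int, seq_len: int):
    # Inverse frequencies for the rotary embedding, computed in float32
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
    # The crux of the fix: position indices created as float, not int64,
    # so the exported ONNX graph type-checks
    positions = torch.arange(seq_len, dtype=torch.float)
    sinusoid_inp = torch.einsum("i,j->ij", positions, inv_freq)
    return torch.sin(sinusoid_inp), torch.cos(sinusoid_inp)
```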
Looks good, thanks for the fix!
Fine, I'm going to accept @lewtun's suggestion. 🤗 I see there are some failing checks, and I also see problems with linting. Is this normal?
The …
Last time I tried to run …
@chainyo But I don't think it's working well when I tested it with a model that was exported with use_cache / use_past enabled. Can you give me an example, or check if there's anything wrong with my code? Below is the test I did:

```python
import onnxruntime

ort_session = onnxruntime.InferenceSession(onnx_path, providers=['CUDAExecutionProvider'])

# Check the session's inputs
for ort_session_input in ort_session.get_inputs():
    print(ort_session_input.name, ort_session_input.shape, ort_session_input.type)
# input_ids ['batch', 'sequence'] tensor(int64)
# past_key_values.0.key ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# past_key_values.0.value ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# ...
# past_key_values.27.key ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# past_key_values.27.value ['batch', 16, 'past_sequence + sequence', 256] tensor(float)
# attention_mask ['batch', 'past_sequence + sequence'] tensor(float)

input_txt_list = [
    'text for test',
    'gptj',
]
# make_onnx_inputs is my own helper: it tokenizes the texts and builds empty
# past_key_values placeholders, returning a dict of torch tensors
ort_input = make_onnx_inputs(input_txt_list)
for k, v in ort_input.items():
    print(k, v.size(), v.dtype)
# input_ids torch.Size([2, 3]) torch.int64
# past_key_values.0.key torch.Size([2, 16, 0, 256]) torch.float32
# past_key_values.0.value torch.Size([2, 16, 0, 256]) torch.float32
# ...
# past_key_values.27.key torch.Size([2, 16, 0, 256]) torch.float32
# past_key_values.27.value torch.Size([2, 16, 0, 256]) torch.float32
# attention_mask torch.Size([2, 3]) torch.float32

# TypeError raised here (see the error message below)
ort_output = ort_session.run(None, ort_input)
```

And this is the error message:
Hi @lsn1106, could you try loading your model in netron.app and checking the expected inputs by clicking on the first layer? I'm not sure, but maybe …
Hey @lsn1106, looking at your error in ORT, it seems that you're not passing inputs with the correct types. What happens if you cast your inputs to NumPy arrays and ensure that …
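For what it's worth, here is a minimal sketch of that suggestion, reusing ort_session and ort_input from the snippet above; the explicit float32 cast for attention_mask is an assumption based on the tensor(float) input type printed earlier:

```python
import numpy as np

# InferenceSession.run expects NumPy arrays, not torch tensors
ort_input_np = {k: v.cpu().numpy() for k, v in ort_input.items()}
# The exported graph declares attention_mask as tensor(float)
ort_input_np['attention_mask'] = ort_input_np['attention_mask'].astype(np.float32)
ort_output = ort_session.run(None, ort_input_np)
```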
@lewtun …
Maybe you can rebase on …
@lsn1106 would you mind sharing a reproducible code snippet that shows how you export the model, how you create the inputs for ORT, etc.?
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Force-pushed from d61afbe to 8417da1
Well, it seems to be solved, thanks!
I think the last thing we need to do is run …
Yes, sorry, the first …
Great, thanks for fixing the style issues! Merging this, since the issue reported by @lsn1106 is unrelated to the fix provided by this PR.
* add gptj to TOKENIZER_MAPPING_NAMES
* fix int32 to float to avoid problem in onnx
* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
What does this PR do?
Fix some problems encountered while converting a GPT-J model to ONNX.
Thanks to @ri938, who found where to fix the bugs (on the 🤗 Discord).
Models:
gpt2: @patrickvonplaten, @LysandreJik
and @lewtun, because you reviewed the first PR for the GPT-J ONNX config, here: #16274
I'm currently uploading a fully converted EleutherAI/gpt-j-6B model to the Hub, which demonstrates that the conversion command line works with these fixes. Find it here. Here is the command I used (I had to set atol to 1e-04, because validation failed at 1e-05):
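For reference, a typical transformers.onnx export invocation from that era looks like the sketch below; the exact command was elided above, and the feature name and output directory here are assumptions:

```bash
python -m transformers.onnx --model=EleutherAI/gpt-j-6B --feature=causal-lm --atol=1e-4 onnx/
```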