
Unable to convert Facebook/mbart-many-to-many model to ONNX #10420

Closed
sankarsiva123 opened this issue Feb 26, 2021 · 6 comments

Comments

@sankarsiva123

When I tried to convert the Facebook/mbart-many-to-many model, the conversion failed with an error.
Please help me convert this model to ONNX.
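
For reference, an export attempt with the convert_graph_to_onnx helper that shipped with transformers at the time might look roughly like the sketch below. The exact arguments are assumptions, since the original command is not shown in the issue, and the export is expected to fail for an encoder-decoder model such as mBART (see the next comment).

```python
# Hypothetical reconstruction of an ONNX export attempt; argument values are
# assumptions for illustration, not the reporter's actual command.
from pathlib import Path

from transformers.convert_graph_to_onnx import convert

convert(
    framework="pt",                                    # export from the PyTorch weights
    model="facebook/mbart-large-50-many-to-many-mmt",  # model discussed in this issue
    output=Path("onnx/mbart.onnx"),                    # where the exported graph would be written
    opset=12,                                          # ONNX opset version
)
```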

@LysandreJik
Member

I do not think mBART can be converted to ONNX as of now.

@sankarsiva123
Author

Hi, thanks for the information.
Facebook/many-to-many takes about 9 seconds per translation on CPU. Is there a way to reduce the inference time?

@Narsil
Contributor

Narsil commented Mar 2, 2021

Hi @sankarsiva123, have you tried HF's API inference?

9s per inference seems a bit off: https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt?text=Hello+there+%21+
We do run some optimizations on HF's hosted API, but even so it seems like you should be able to get better inference times than 9s.

Maybe it depends on what you are sending it? Are you using a GPU or a CPU?
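
For completeness, a minimal sketch of querying the hosted Inference API mentioned above (the token is a placeholder, and the payload shape is an assumption for a translation model):

```python
# Minimal sketch of calling the hosted Inference API; replace the placeholder
# token with a real one from your Hugging Face account settings.
import requests

API_URL = "https://api-inference.huggingface.co/models/facebook/mbart-large-50-many-to-many-mmt"
headers = {"Authorization": "Bearer YOUR_HF_API_TOKEN"}  # placeholder token

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello there !"})
print(response.json())  # typically a list like [{"translation_text": "..."}]
```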

@sankarsiva123
Author

sankarsiva123 commented Mar 3, 2021

Hi @Narsil, yeah, I tried HF's API inference and it is pretty fast.
I am using a CPU; I tried both in Google Colab and on my local machine, and it takes around 9s in both.
Am I missing something in how I use the model that makes my inference time higher than normal?
Also, please let me know if there is a way to reduce the inference time.
[screenshot of the inference code and its timing, attached in the original issue]
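
The screenshot itself is not reproduced here; for context, a typical mBART-50 many-to-many translation call on CPU looks roughly like the sketch below (language codes and input text are illustrative placeholders, not necessarily what the screenshot showed):

```python
# Typical usage of facebook/mbart-large-50-many-to-many-mmt on CPU; the
# source/target languages here are placeholders.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

tokenizer.src_lang = "en_XX"                                  # source language code
inputs = tokenizer("Hello there!", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"],   # target language code
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```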

@Narsil
Contributor

Narsil commented Mar 3, 2021

Can you time your inner loop without the tokenizer? (Just making sure it's not that.)

Otherwise you seem to be using generate, which is the right way to go.
I don't know Colab's CPU or yours, but it could definitely be the problem (or the PyTorch build you're running, which might not have been optimized for your CPU's instruction set).
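
A sketch of the timing split suggested above, wrapping the tokenizer call and the generate call in separate timers (input text and language codes are placeholders):

```python
# Time tokenization and generation separately to see where the 9s is spent.
import time

from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
tokenizer.src_lang = "en_XX"

t0 = time.perf_counter()
inputs = tokenizer("Hello there!", return_tensors="pt")       # tokenizer only
t1 = time.perf_counter()
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"],
)                                                             # generate only
t2 = time.perf_counter()

print(f"tokenizer: {t1 - t0:.3f}s  generate: {t2 - t1:.3f}s")
```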

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
