
Inference of 60 characters costs 0.86 seconds on CPU; how to accelerate? #4

Closed
liuhuang31 opened this issue Jul 25, 2022 · 15 comments

Comments

@liuhuang31

liuhuang31 commented Jul 25, 2022

Hi,
Thanks for the provided code and model.
When I use g2pW to do g2p, it takes too long:
conv = G2PWConverter(style='pinyin', enable_non_tradional_chinese=True)

Inference on 60 characters costs 0.86 s on CPU. Is there any way to accelerate it? Thanks again.

@liuhuang31 liuhuang31 edited the issue title on Jul 25, 2022, revising the reported timing from 5-6 seconds to 1.58 seconds and finally to 0.86 seconds on CPU.
@yt605155624

yt605155624 commented Aug 10, 2022

Maybe you can convert the model to ONNX and use onnxruntime; check the PR in #5.

@liuhuang31

Thanks. I modified the g2pW code: the dataloader was changed accordingly, and in the end it predicts a whole sentence directly, so the speed is about 0.07-0.13 s. I also added a g2pw option under paddle.

@yt605155624

yt605155624 commented Aug 10, 2022

Nice~ I have printed the timing of G2PWOnnxConverter; the first call through onnxruntime is slow (that is an onnxruntime characteristic).
[screenshot: timing output]
I would appreciate it if you could contribute your improvement to paddlespeech after our onnxruntime version of g2pw is merged, and I am also looking forward to your PR if you are using paddlespeech TTS :)

input text:

我有长头发,我长高了,头发变得长长的,不想长大,你的头发很长
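A small stdlib sketch of the warm-up point above: run the converter once before timing so onnxruntime's slow first call does not skew the benchmark. The `bench` helper is hypothetical, not part of g2pW; `fn` can be any callable such as a constructed converter.

```python
# Hypothetical helper: time a callable after one warm-up call, since
# onnxruntime's first inference includes one-off session/graph setup.
import time

def bench(fn, arg, n=10):
    fn(arg)                      # warm-up call, excluded from the measurement
    t0 = time.perf_counter()
    for _ in range(n):
        fn(arg)
    return (time.perf_counter() - t0) / n   # mean seconds per call

# usage with a cheap stand-in callable:
mean = bench(str.upper, "我有长头发", n=5)
print(f"{mean:.6f} s per call")
```

With a real converter you would pass something like `bench(conv, '我有长头发...')` so that only steady-state calls are averaged.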

@GitYCC

GitYCC commented Aug 10, 2022

Thanks. I modified the g2pW code: the dataloader was changed accordingly, and in the end it predicts a whole sentence directly, so the speed is about 0.07-0.13 s. I also added a g2pw option under paddle.

@liuhuang31
Thanks for your response. The feature of "predicting a whole sentence in one shot" sounds interesting.
Could I invite you to open a PR and become a contributor?
Or, if you don't have time, could you share a piece of code to help us add this feature ourselves?

@liuhuang31

Thanks for your response. I'd like to open a PR; that is more convenient.

@GitYCC

GitYCC commented Aug 10, 2022

@liuhuang31 Thank you! I am looking forward to your PR.

@beyondguo

Hi, could you please tell me how to use G2PWOnnxConverter? I didn't find it in the code.

@liuhuang31

Hi, the newest code uses the ONNX converter model by default, so just install the latest code and use it.

@liuhuang31

Hi, could you please tell me how to use G2PWOnnxConverter ? I didn't find it in the code.

Is it solved? If not, what problem did you run into? @beyondguo

@beyondguo

beyondguo commented Sep 30, 2022

@liuhuang31
Hi!
I installed via pip just a few days ago, so it should be the latest version. But inference still feels quite slow:

from g2pw import G2PWConverter
conv = G2PWConverter(style='pinyin', enable_non_tradional_chinese=True)

%timeit conv('然而,他红了20年以后,他竟退出了大家的视线。')

Average time:

701 ms ± 30.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This is not the initial model load; every run takes about this long. My current need is to annotate pinyin for a large corpus, so I'd like the inference to be faster.

@liuhuang31

@beyondguo You can look at my pull request; on the older version, 60 characters takes 0.08-0.13 s.

@beyondguo

beyondguo commented Sep 30, 2022

@liuhuang31
I just downloaded your version (https://github.com/liuhuang31/g2pW) and tested it with the same code:

>>> %timeit conv('然而,他红了20年以后,他竟退出了大家的视线。')
<<< 1.44 s ± 39.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

But it's even slower; I wonder whether I'm using it wrong somewhere [lol]

@liuhuang31

@beyondguo You probably didn't use it correctly. I'll discuss it with you when I have time; I'm busy with urgent work right now.

@JohnHerry

It is indeed slow. Is there any way to slim down the model? ONNX is also slow.

@liuhuang31

liuhuang31 commented Mar 28, 2023

Each g2pW prediction first segments the sentence, so one sentence may be split into, say, 10 pieces, which means 10 separate model calls.
I didn't slim down the model; I directly modified the code to feed in the whole sentence and predict only once, which is why it is fast.

You can refer to my earlier code. The rough logic: generate the prediction data in a single pass instead of in a loop, and call the model once instead of looping over segments.
I wrote that code last year, so I've forgotten some of the details.
https://github.com/liuhuang31/g2pW
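A minimal sketch of the batching idea described above (the names here are hypothetical, not the actual g2pW code): collect every polyphonic-character query for a sentence into one batch and run the model exactly once, instead of once per query.

```python
# Sketch: one batched forward pass instead of a loop of single-query calls.
import numpy as np

def fake_model(batch, positions):
    # stand-in for the real network: one prediction per (row, position) pair
    return [float(row[pos]) for row, pos in zip(batch, positions)]

def predict_batched(encoded, positions):
    # replicate the shared sentence encoding once per query position,
    # then make a single model call covering all of them
    batch = np.repeat(encoded[None, :], len(positions), axis=0)
    return fake_model(batch, positions)

encoded = np.arange(5.0)                 # dummy per-character encoding
print(predict_batched(encoded, [1, 3]))  # [1.0, 3.0]
```

The speedup comes from amortizing the per-call overhead (tokenization, session dispatch) over all queries in the sentence, which matches the 0.86 s to 0.07-0.13 s improvement reported above.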
