[S2T] Running the program on Win10 produces no result #2387

Closed
dsyrock opened this issue Sep 15, 2022 · 12 comments

@dsyrock

dsyrock commented Sep 15, 2022

I want to run this offline on Windows.

With the following code:

from paddlespeech.cli.asr.infer import ASRExecutor

asr = ASRExecutor()
# Raw string: in a plain literal, the "\t" in "k:\test3.wav" is read as a tab.
result = asr(audio_file=r"k:\test3.wav")
print(result)
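A side note on Windows paths written as Python string literals: in a regular string, `\t` is the escape for a tab character, so a path such as `k:\test3.wav` is only safe as a raw string or with forward slashes. The thread does not establish that this played any part in the behavior reported here; the sketch below only illustrates the difference.

```python
from pathlib import PureWindowsPath

plain = "k:\test3.wav"    # "\t" is parsed as a TAB character
raw = r"k:\test3.wav"     # raw string keeps the backslash literally
forward = "k:/test3.wav"  # forward slashes are also accepted on Windows

print(repr(plain))
print(repr(raw))
print("\t" in plain, "\t" in raw)

# PureWindowsPath treats the raw and forward-slash spellings as the same path.
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

Printing `repr()` of a path before passing it to a library is a quick way to spot an accidental escape.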

Or from the command line:
paddlespeech asr --lang zh --input test3.wav

In the end, no result is printed. The only output on screen is:
2022-09-15 01:05:46.604 | INFO | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
2022-09-15 01:05:46.749 | INFO | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
test3.wav is a mono, 16000 Hz, voice-only recording under 5 seconds long, so it should meet the requirements.

@yt605155624
Collaborator

Try it with the audio file we provide:

wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav

That will show whether the problem lies in the audio or in the environment.

@JiaXiao243
Contributor

JiaXiao243 commented Sep 15, 2022

@dsyrock With zh.wav, paddle=2.3.2, and paddlespeech=1.1.3, it works normally on my side.
Could you share your audio file?
wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
paddlespeech asr --lang zh --input zh.wav
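To compare a local setup against the versions quoted above, the installed versions can be read without importing the packages. This is a minimal sketch using the standard library; it assumes the CPU build of Paddle, which is published on PyPI as `paddlepaddle` (GPU builds use a different distribution name).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "paddlepaddle" is the distribution that provides the `paddle` module
# (assumption: CPU build).
for name in ("paddlepaddle", "paddlespeech"):
    print(name, installed_version(name))
```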

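@dsyrock
Author

dsyrock commented Sep 15, 2022

I get the same result with both my own file and the sample zh.wav: nothing is printed. The online demo, on the other hand, recognizes both my file and the sample correctly. I don't know what my local setup is missing.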

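@yt605155624
Collaborator

yt605155624 commented Sep 15, 2022

This may be a problem with your local environment. The CLI currently prints very little information, so you can edit the installed source (located at <your Python environment>/lib/<your Python version>/site-packages/paddlespeech) and change

self.logger.setLevel(logging.INFO)
to logging.DEBUG to see more information, or add your own print statements. The program entry point is https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/paddlespeech/cli/asr/infer.py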

@dsyrock
Author

dsyrock commented Sep 15, 2022

Thanks, I'll try it right away.

@dsyrock
Author

dsyrock commented Sep 15, 2022

How should I change line 52 of log.py to show more information? Mine already reads
self.logger.setLevel(logging.INFO)

@yt605155624
Collaborator

Change it to self.logger.setLevel(logging.DEBUG)
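The effect of that one-line change can be seen in isolation with the standard `logging` module. The sketch below is not PaddleSpeech code; the logger name `demo_cli` is hypothetical and the messages are made up. It only shows that a logger set to INFO drops DEBUG records, while the same logger set to DEBUG emits them.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("demo_cli")  # hypothetical logger name
logger.propagate = False                # keep output out of the root logger
logger.addHandler(logging.StreamHandler(stream))

logger.setLevel(logging.INFO)
logger.debug("hidden at INFO")          # below the threshold, dropped
logger.info("shown at INFO")

logger.setLevel(logging.DEBUG)
logger.debug("shown at DEBUG")          # now passes the threshold

print(stream.getvalue())
```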

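@dsyrock
Author

dsyrock commented Sep 15, 2022

Thanks, I made the change you suggested. Strangely, the command-line run still shows only the two messages I mentioned above, with no error and no result.

Running it from code gives some additional output: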

[2022-09-15 18:53:17,322] [ DEBUG] - start to init the model
[2022-09-15 18:53:17,323] [ DEBUG] - File C:\Users\Noiz\.paddlespeech\models\conformer_wenetspeech-zh-16k\1.0\asr1_conformer_wenetspeech_ckpt_0.1.1.model.tar.gz md5 checking...
[2022-09-15 18:53:23,196] [ DEBUG] - C:\Users\Noiz\.paddlespeech\models\conformer_wenetspeech-zh-16k\1.0\asr1_conformer_wenetspeech_ckpt_0.1.1.model.tar
[2022-09-15 18:53:23,196] [ DEBUG] - C:\Users\Noiz\.paddlespeech\models\conformer_wenetspeech-zh-16k\1.0\asr1_conformer_wenetspeech_ckpt_0.1.1.model.tar\model.yaml
[2022-09-15 18:53:23,197] [ DEBUG] - C:\Users\Noiz\.paddlespeech\models\conformer_wenetspeech-zh-16k\1.0\asr1_conformer_wenetspeech_ckpt_0.1.1.model.tar\exp\conformer\checkpoints\wenetspeech.pdparams

2022-09-15 18:53:23.643 | INFO | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
2022-09-15 18:53:23.797 | INFO | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
[2022-09-15 18:53:33,178] [ DEBUG] - The asr server limit max duration len: 200.0
[2022-09-15 18:53:33,179] [ DEBUG] - checking the audio file format......
[2022-09-15 18:53:33,180] [ DEBUG] - The sample rate is 16000
[2022-09-15 18:53:33,181] [ DEBUG] - The audio file format is right
[2022-09-15 18:53:33,181] [ DEBUG] - Preprocess audio_file:k:\zh.wav
[2022-09-15 18:53:33,182] [ DEBUG] - get the preprocess conf
[2022-09-15 18:53:33,184] [ DEBUG] - read the audio file
[2022-09-15 18:53:33,185] [ DEBUG] - audio shape: (79949,)
[2022-09-15 18:53:33,278] [ DEBUG] - audio feat shape: [1, 498, 80]
[2022-09-15 18:53:33,279] [ DEBUG] - audio feat process success
[2022-09-15 18:53:33,280] [ DEBUG] - start to infer the model to get the output
[2022-09-15 18:53:33,281] [ DEBUG] - we will use the transformer like model : conformer_wenetspeech
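The two shapes in the log are consistent with each other, which suggests preprocessing finished and the run stalled or exited during inference. A rough check, assuming a 25 ms window and a 10 ms hop with no padding (common feature-extraction defaults; the thread does not state the actual values):

```python
def num_frames(num_samples, sample_rate=16000, win_ms=25, hop_ms=10):
    """Frame count for a sliding window without padding."""
    win = sample_rate * win_ms // 1000  # 400 samples at 16 kHz (assumed window)
    hop = sample_rate * hop_ms // 1000  # 160 samples at 16 kHz (assumed hop)
    return (num_samples - win) // hop + 1


samples = 79949               # from "audio shape: (79949,)"
print(samples / 16000)        # clip duration in seconds
print(num_frames(samples))    # compare with 498 in "audio feat shape: [1, 498, 80]"
```

Under these assumptions the frame count matches the middle dimension of the feature shape; the last dimension, 80, would then be the number of features per frame.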

@yt605155624
Collaborator

The CLI is wrapped in another layer, so you also need to change this:

l.setLevel(logging.ERROR)

You can also debug directly by adding print statements to the code.
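The `l.setLevel(logging.ERROR)` line indicates that some loggers are being set to ERROR, though the thread does not show which ones `l` refers to. As a generic way to find out which loggers in a process have been silenced, the standard library exposes the logger registry. The sketch below uses a hypothetical logger name, `some_library`, to simulate the situation.

```python
import logging


def logger_levels():
    """Map each registered logger name to its effective level name."""
    levels = {}
    for name, obj in logging.root.manager.loggerDict.items():
        # The registry can also hold PlaceHolder objects; skip those.
        if isinstance(obj, logging.Logger):
            levels[name] = logging.getLevelName(obj.getEffectiveLevel())
    return levels


# Simulate a wrapper that silences a library logger (hypothetical name).
logging.getLogger("some_library").setLevel(logging.ERROR)
print(logger_levels()["some_library"])

# Raise the verbosity again for a debugging session.
logging.getLogger("some_library").setLevel(logging.DEBUG)
print(logger_levels()["some_library"])
```

Calling `logger_levels()` after importing the library under investigation lists every logger it registered, which avoids guessing at names.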

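@dsyrock
Author

dsyrock commented Sep 15, 2022

It was mainly because I didn't read the installation docs carefully. They recommend using the Tsinghua mirror when installing paddlespeech; I missed that and used the Baidu mirror instead. A forced reinstall from the Tsinghua mirror fixed it. Thank you!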
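The exact command used for the reinstall is not shown in the thread. One plausible form, sketched here as a command built in Python, combines pip's `--force-reinstall` and `--index-url` options with the Tsinghua (TUNA) PyPI mirror. The helper name `reinstall_command` is made up for illustration, and the sketch only prints the command rather than running it.

```python
import sys

# Tsinghua (TUNA) PyPI mirror index URL.
TUNA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def reinstall_command(package, index_url=TUNA_INDEX):
    """Build a pip command that force-reinstalls `package` from `index_url`."""
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",
        "--index-url", index_url,
        package,
    ]


cmd = reinstall_command("paddlespeech")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Using `sys.executable -m pip` ensures the package is installed into the same interpreter that will later import it.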

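@younghuvee

Try it with the audio file we provide:

wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav

That will show whether the problem lies in the audio or in the environment.

I'm in the same situation: the official sample audio is recognized correctly, but my own recording is not. I'm on Windows. Are there any format requirements for the recording file?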

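@Chuyaoyuan

@younghuvee Check whether the sample rate and bit rate of your audio file match those of the official sample file. If they differ, convert your file before running recognition.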
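A quick way to make that comparison is to read the WAV headers with Python's standard `wave` module. This is a minimal sketch: it handles uncompressed PCM WAV files only, the helper names and file names are placeholders, and it reports bit depth rather than bit rate (for uncompressed PCM, bit rate is sample rate × bit depth × channels).

```python
import wave


def wav_info(path):
    """Read basic format fields from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_rate": wf.getframerate(),
            "bits_per_sample": wf.getsampwidth() * 8,
            "duration_s": wf.getnframes() / wf.getframerate(),
        }


def same_format(a, b):
    """True when channels, sample rate, and bit depth all match."""
    keys = ("channels", "sample_rate", "bits_per_sample")
    return all(a[k] == b[k] for k in keys)


# Usage (file names are placeholders):
# print(same_format(wav_info("zh.wav"), wav_info("my_recording.wav")))
```

If `wave.open` raises an error on a recording, the file is probably not plain PCM WAV, which is itself a useful hint that it needs converting.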
