where to get the dictionary of vosk-model-en-us-0.22-lgraph #1555

Open
YangangCao opened this issue Apr 8, 2024 · 8 comments

Comments

@YangangCao

Hi, I would like to get the dictionary for vosk-model-en-us-0.22-lgraph so that I can check the phones of each word. Where can I get it? Thanks very much.

@nshmyrev
Collaborator

nshmyrev commented Apr 8, 2024

It is inside the compilation package:

https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-compile.zip

@YangangCao
Author

OK, got it. Thanks for your quick and accurate reply!

@YangangCao
Author

YangangCao commented Apr 8, 2024

Hi, sorry to bother you again. I found some phones that are hard to read, for example:
electromagnetic electromagnetic @_B l_I E_I k_I t_I r_I oU_I m_I {_I g_I n_I E_I 4_I I_I k_E
electromagnetic electromagnetic @_B l_I E_I k_I t_I r_I oU_I m_I {_I g_I n_I E_I t_I I_I k_E
electromagnetic electromagnetic I_B l_I E_I k_I 4_I r_I oU_I m_I {_I g_I n_I E_I 4_I I_I k_E
electromagnetic electromagnetic I_B l_I E_I k_I t_I r_I oU_I m_I {_I g_I n_I E_I t_I I_I k_E

What is 4_I?
Also, the word "electromagnetic" has 4 pronunciation variants. I want to calculate GOP (goodness of pronunciation); how do I decide which single variant to use?

@nshmyrev
Collaborator

nshmyrev commented Apr 9, 2024

> what's 4_I ?

4 is a SAMPA phone: the alveolar flap, like the "tt" in American English "butter".

_I marks a word-internal phone (likewise _B marks word-begin and _E word-end). You are probably looking at the intermediate lexicon instead of the original one.
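For anyone reading such entries programmatically, here is a minimal Python sketch. It assumes the lines follow the layout shown earlier in the thread (the word, the word repeated, then the phones) and that the suffixes follow the usual Kaldi word-position convention (_B begin, _I internal, _E end, _S singleton). The function names are made up for illustration and are not part of Vosk or Kaldi.

```python
# Assumed word-position suffixes (Kaldi convention).
POSITION_SUFFIXES = ("_B", "_I", "_E", "_S")


def strip_position(phone):
    """Return the phone without its word-position suffix, if it has one."""
    for suffix in POSITION_SUFFIXES:
        if phone.endswith(suffix):
            return phone[: -len(suffix)]
    return phone


def parse_lexicon_line(line):
    """Split 'word word p1_B p2_I ... pN_E' into (word, [bare phones])."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"unexpected lexicon line: {line!r}")
    word = fields[0]
    phones = [strip_position(p) for p in fields[2:]]
    return word, phones


line = ("electromagnetic electromagnetic @_B l_I E_I k_I t_I r_I oU_I "
        "m_I {_I g_I n_I E_I 4_I I_I k_E")
word, phones = parse_lexicon_line(line)
print(word, phones)
```

Collecting the results per word gives the list of pronunciation variants, which makes it easy to see that the four entries for "electromagnetic" differ only in the first vowel and in t versus 4.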

> I want to calculate GOP(goodness of pronunciation), how to decide the only one phone arrangement?

Run alignment; the aligner picks the pronunciation variant that best matches the audio.
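Once alignment has produced a single phone sequence with frame boundaries, a posterior-based GOP score could be sketched as below. This uses one common simplified formulation (the mean log posterior of the aligned phone over its frames) and is not necessarily what any Vosk or Kaldi recipe computes. The input layout (one dict of phone posteriors per frame, plus `(phone, start, end)` segments with an exclusive end) and the name `gop_scores` are assumptions for illustration.

```python
import math


def gop_scores(posteriors, segments, floor=1e-10):
    """Mean log posterior of each aligned phone over its frames.

    posteriors: list with one dict per frame, mapping phone -> posterior.
    segments:   list of (phone, start_frame, end_frame), end exclusive.
    floor:      lower bound on the posterior to avoid log(0).
    """
    scores = []
    for phone, start, end in segments:
        frames = posteriors[start:end]
        if not frames:
            raise ValueError(f"empty segment for phone {phone!r}")
        total = sum(math.log(max(f.get(phone, 0.0), floor)) for f in frames)
        scores.append((phone, total / len(frames)))
    return scores


# Toy example: two frames aligned to the phone "t".
frames = [{"t": 0.5, "4": 0.5}, {"t": 0.25, "4": 0.75}]
print(gop_scores(frames, [("t", 0, 2)]))
```

Scores closer to zero mean the acoustic model agrees more strongly with the aligned phone; any threshold for "mispronounced" would need to be tuned on real data.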

@YangangCao
Author

Thanks for your reply, it helps me a lot.

@YangangCao
Author

YangangCao commented Apr 26, 2024

Hi, why does Vosk use a different phone set from Kaldi?
For example, "@", "{", and "4" appear in the Vosk model but not in the Kaldi model.
Any idea how to map from one to the other?

@nshmyrev
Collaborator

That particular model uses a different phoneset, unfortunately. You can still map it easily; it is a simple mapping. Other models, like the gigaspeech one, use the standard CMU dictionary.
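As a rough illustration of such a mapping, the Python sketch below converts the SAMPA-style symbols that appear in this thread to stress-less CMU (ARPAbet) symbols. The table is partial and hypothetical: it covers only the phones in the "electromagnetic" entries, it is not taken from the model package, and the flap `4` has no dedicated CMU symbol, so mapping it to `T` is an assumption.

```python
# Partial, illustrative table: SAMPA-style symbol -> CMU/ARPAbet (no stress).
SAMPA_TO_ARPABET = {
    "@": "AH",   # schwa
    "{": "AE",
    "E": "EH",
    "I": "IH",
    "oU": "OW",
    "4": "T",    # assumption: the flap is folded into T
    "l": "L", "k": "K", "t": "T", "r": "R",
    "m": "M", "g": "G", "n": "N",
}


def map_phones(phones, table=SAMPA_TO_ARPABET):
    """Map a list of bare phones, failing loudly on unknown symbols."""
    unknown = sorted({p for p in phones if p not in table})
    if unknown:
        raise KeyError(f"no mapping for phones: {unknown}")
    return [table[p] for p in phones]


variant = ["I", "l", "E", "k", "t", "r", "oU", "m", "{", "g", "n", "E", "4", "I", "k"]
print(" ".join(map_phones(variant)))
```

Failing on unknown symbols rather than passing them through makes gaps in the table visible while it is being extended to the full phoneset.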

@YangangCao
Author

Thanks for your quick and accurate reply. I know about the gigaspeech model now, and it is good enough for me, so I don't plan to map the phoneset. Thanks!
