
train model with feat.params (non g2p model) #171

Closed
YashNarK opened this issue Feb 22, 2019 · 5 comments

Comments

@YashNarK

I am able to train a g2p model with g2p-seq2seq, but that model can't be used by pocketsphinx. If I try, it always fails with the error "RuntimeError: new_Decoder returned -1".

On the other hand, the models that pocketsphinx can use can't be trained by g2p-seq2seq. If I try, it fails with the error "Exception: File .........\model\en-us\en-us\model.params not exists."

The main difference I noticed is that pocketsphinx works with models containing feat.params, while g2p-seq2seq expects model.params.

Now I need either a g2p model that works under sphinx, or a normal (acoustic) model that can be trained with g2p-seq2seq.
Can someone help with this? Thanks in advance.
VERSION INFO:
g2p-seq2seq 6.2.2a0
tensor2tensor 1.6.6 (recommended on many forums as the only version that works well with g2p-seq2seq)
tensorflow-gpu 1.13.0rc2
pocketsphinx 0.1.15
Python 3.7.2

@nshmyrev
Contributor

The acoustic model used by pocketsphinx is not the same kind of model as the g2p model trained by g2p-seq2seq.

@YashNarK
Author

In that case, how can I train my acoustic model? If I create a corpus file and build a dictionary and language model for a few specific keywords, the accuracy is good, but the number of keywords is small. If I try to increase the number of words, the accuracy drops. Is there a way to improve accuracy without decreasing the number of words in the default dictionary?
Is there a way to employ TensorFlow to improve acoustic model accuracy?
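For the "few specific keywords" use case described above, pocketsphinx also offers a keyword-spotting mode, where the decoder searches continuous audio only for listed keyphrases instead of doing full dictation. It takes a plain-text keyphrase list (passed via the `-kws` option), one phrase per line with a detection threshold; the phrases and threshold values below are purely illustrative and need per-phrase tuning:

```
forward /1e-25/
back /1e-25/
turn left /1e-30/
turn right /1e-30/
```

Lower thresholds (e.g. 1e-40) make detection stricter with fewer false alarms; higher thresholds catch more true hits at the cost of false positives. This sidesteps the accuracy-versus-vocabulary-size trade-off for small command sets, though it does not improve the acoustic model itself.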

@nshmyrev
Contributor

Is there a way to improve accuracy without decreasing the number of words in the default dictionary?

Add more training data

Is there a way to employ TensorFlow to improve acoustic model accuracy?

No
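"Training data" here means recordings paired with transcriptions in the format the CMUSphinx tools expect: a fileids file listing recording IDs (one per line, no extension) and a transcription file with the spoken text followed by the matching ID in parentheses. A sketch of the two files, using IDs and sentences in the style of the arctic20 adaptation set:

```
arctic_0001
arctic_0002
arctic_0003
```

```
<s> author of the danger trail philip steels etc </s> (arctic_0001)
<s> not at this particular case tom apologized whittemore </s> (arctic_0002)
<s> for the twentieth time that evening the two men shook hands </s> (arctic_0003)
```

Each listed ID must correspond to an audio file (e.g. arctic_0001.wav, 16 kHz mono) in the working directory, and every word in the transcriptions must appear in the pronunciation dictionary.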

@YashNarK
Author

Training data? How do I do that? Do you mean the tutorial on adapting an acoustic model using arctic.fileids and arctic.transcription?
If so, I have read through that page many times and still have several doubts:

  1. Can only 20 words or sentences be used to train with this method (referring to arctic20)?
  2. How exactly is it done? Can you please explain the steps?
    Assume I have a long audio file containing all utterances of my keywords (16000 Hz, mono) and I have split it into 20 or more short audio files, one per keyword. What exactly do I do next?
    Is this training limited to my voice alone, or does it improve overall accuracy irrespective of the speaker?
    If "training data" doesn't mean what I think it means, what is it, and how do I use it?
    Thanks for your quick responses.

@jpetso

jpetso commented Feb 23, 2019

@YashNarK, I'm not a contributor or maintainer of this project, but I came here to say that it is important for you to understand that your request is off-topic. The people running this project have no obligation to help you figure out how to train acoustic models; that topic is almost entirely unrelated to what this project is about, and it's inconsiderate to ask for information in the wrong place.

Please figure out the right place to ask, or better, find the right documentation on the internet to understand by yourself what the process is, which inputs you need and how to provide those to your training pipeline.

@cmusphinx cmusphinx locked as off-topic and limited conversation to collaborators Feb 23, 2019