how to run decoding on chain model built by myself #45

Open
kelvinqin opened this issue Feb 24, 2021 · 2 comments

Comments

@kelvinqin

Guenter,
Thanks for your work. I successfully compiled and tested your code, and I can now run decoding with your pre-trained model (kaldi-generic-en-tdnn_f-r20190609). Great work!

Meanwhile, I am learning Kaldi and have almost finished building my first model using one of the Mandarin recipes (aishell). Is it possible to run decoding with your code using my own model?

My model is a chain model built by following the standard recipe (kaldi/egs/aishell/s5/local/chain/run_tdnn.sh).

One difficulty I foresee is collecting all the required data files into the model directory. Please advise if there is a document describing which files are needed and which Kaldi source directory each one comes from. From reading your code, it seems I just need to collect the files referenced in the following code:

    cdef unicode mfcc_config           = u'%s/conf/mfcc_hires.conf'                  % self.modeldir
    cdef unicode word_symbol_table     = u'%s/%s/graph/words.txt'                    % (self.modeldir, self.model)
    cdef unicode model_in_filename     = u'%s/%s/final.mdl'                          % (self.modeldir, self.model)
    cdef unicode splice_conf_filename  = u'%s/ivectors_test_hires/conf/splice.conf'  % self.modeldir
    cdef unicode fst_in_str            = u'%s/%s/graph/HCLG.fst'                     % (self.modeldir, self.model)
    cdef unicode align_lex_filename    = u'%s/%s/graph/phones/align_lexicon.int'     % (self.modeldir, self.model)

    self.ie_conf_f.write((u"--cmvn-config=%s/conf/online_cmvn.conf\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--ivector-period=%d\n" % online_ivector_period).encode('utf8'))
    self.ie_conf_f.write((u"--splice-config=%s\n" % splice_conf_filename).encode('utf8'))
    self.ie_conf_f.write((u"--lda-matrix=%s/extractor/final.mat\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--global-cmvn-stats=%s/extractor/global_cmvn.stats\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--diag-ubm=%s/extractor/final.dubm\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--ivector-extractor=%s/extractor/final.ie\n" % self.modeldir).encode('utf8'))
    self.ie_conf_f.write((u"--num-gselect=%d\n" % num_gselect).encode('utf8'))
    self.ie_conf_f.write((u"--min-post=%f\n" % min_post).encode('utf8'))
    self.ie_conf_f.write((u"--posterior-scale=%f\n" % posterior_scale).encode('utf8'))
    self.ie_conf_f.write((u"--max-remembered-frames=1000\n").encode('utf8'))
    self.ie_conf_f.write((u"--max-count=%d\n" % max_count).encode('utf8'))
    self.ie_conf_f.flush()

Could you kindly elaborate on which Kaldi source directories those files come from?
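
For reference, the file layout implied by the quoted code can be written down as a small checker. This is only a sketch: the relative paths are copied from the snippet above, while the function names `expected_model_files` and `missing_model_files` are made up for illustration and are not part of the project. It does not say where in a Kaldi experiment directory each file originates; it only reports which of the expected files are absent from a model directory.

```python
import sys
from pathlib import Path


def expected_model_files(modeldir, model):
    """List the files the quoted decoder code reads (hypothetical helper).

    modeldir corresponds to self.modeldir and model to self.model
    in the snippet above.
    """
    root = Path(modeldir)
    return [
        # Feature and i-vector extractor configuration
        root / "conf" / "mfcc_hires.conf",
        root / "conf" / "online_cmvn.conf",
        root / "ivectors_test_hires" / "conf" / "splice.conf",
        # i-vector extractor files
        root / "extractor" / "final.mat",
        root / "extractor" / "global_cmvn.stats",
        root / "extractor" / "final.dubm",
        root / "extractor" / "final.ie",
        # Acoustic model and decoding graph
        root / model / "final.mdl",
        root / model / "graph" / "words.txt",
        root / model / "graph" / "HCLG.fst",
        root / model / "graph" / "phones" / "align_lexicon.int",
    ]


def missing_model_files(modeldir, model):
    """Return the expected files that do not exist under modeldir."""
    return [p for p in expected_model_files(modeldir, model) if not p.is_file()]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_model.py <modeldir> <model>
    for path in missing_model_files(sys.argv[1], sys.argv[2]):
        print("missing:", path)
```

Running it against a freshly assembled model directory gives a quick list of what still has to be copied over before trying to decode.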

Thanks!
Kelvin

@kelvinqin

Guenter,
I have figured it out. :-) Thanks!
Kelvin

@svenha

svenha commented Feb 25, 2021

@kelvinqin As aishell is a popular recipe, would you mind sharing your solution? :-)
