Getting Error when using pre trained embeddings to extract ivector #4496

Closed
TPalawaT opened this issue Apr 14, 2021 · 4 comments

@TPalawaT

I was trying to extract i-vectors from the pre-trained LibriSpeech model, and when I run the extraction script I get the error below. Does anyone know why this happens? I also get the same error in my log file with the VoxCeleb model, in case that helps.
Please let me know if I need to post additional information. Thanks!

copy-feats --compress=true ark:- ark,scp:24_ivector_extraction/wsj_SI284_dot_cor.v2/ivector_wsj_SI284_dot_cor.v2_online.5.ark,24_ivector_extraction/wsj_SI284_dot_cor.v2/ivector_wsj_SI284_dot_cor.v2_online.5.scp 
ivector-extract-online2 --config=0013_librispeech_v1_extractor/ivector_extractor/../iv_conf/ivector_extractor.conf ark:05_feat/wsj_SI284_dot_cor.v2/split20/5/spk2utt scp:/share/mini1/res/t/asr/studio/read-us/wsj/embeddings/exp/exp_embedding/05_feat/wsj_SI284_dot_cor.v2/split20/5/feats.scp ark:- 
LOG (ivector-extract-online2[5.5.1048~1-d211d]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (ivector-extract-online2[5.5.1048~1-d211d]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
ASSERTION_FAILED (ivector-extract-online2[5.5.1048~1-d211d]:Check():online-ivector-feature.cc:87) Assertion failed: (lda_mat.NumCols() == spliced_input_dim || lda_mat.NumCols() == spliced_input_dim + 1)
@danpovey
Contributor

danpovey commented Apr 14, 2021 via email

@TPalawaT
Author

Hi, I made some changes to my script. Earlier, I wasn't running the splice-feats binary before applying the transform matrix.
I now do that, but I get the following warnings.

WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46n-46nc020u_1:0-15744 has bad dimension 40x281 versus feat dim 91
LOG (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:161) Applied transform to 0 utterances; 1666 had errors.
WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46n-46nc020v_1:0-12051 has bad dimension 40x281 versus feat dim 91
WARNING (ivector-extract[5.5.1048~1-d211d]:RunPerSpeaker():ivector-extract.cc:123) No features present for utterance WSJ-F46c-46cc0201_1:0-10267

Further down in the log file, I get the final error.

LOG (apply-cmvn-online[5.5.1048~1-d211d]:main():apply-cmvn-online.cc:133) Applied online CMVN to 1666 files, or 1289059 frames.
WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46p-46pc041f_1:0-8220 has bad dimension 40x281 versus feat dim 91
LOG (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:161) Applied transform to 0 utterances; 1666 had errors.
LOG (gmm-global-get-post[5.5.1048~1-d211d]:main():gmm-global-get-post.cc:115) Done 0 files, 0 with errors, average UBM log-likelihood is -nan over 0 frames.
WARNING (gmm-global-get-post[5.5.1048~1-d211d]:Close():kaldi-io.cc:515) Pipe apply-cmvn-online --spk2utt=ark:05_feat/wsj_SI284_dot_cor.v2/split20/4/spk2utt --config=0013_librispeech_v1_extractor/ivector_extractor/online_cmvn.conf 0013_librispeech_v1_extractor/ivector_extractor/global_cmvn.stats scp:05_feat/wsj_SI284_dot_cor.v2/split20/4/feats.scp ark:- | splice-feats --left-context=3 --right-context=3 ark:- ark:- | transform-feats 0013_librispeech_v1_extractor/ivector_extractor/final.mat ark:- ark:- | had nonzero return status 256
ERROR (gmm-global-get-post[5.5.1048~1-d211d]:~SequentialTableReaderArchiveImpl():util/kaldi-table-inl.h:678) TableReader: error detected closing archive 'apply-cmvn-online --spk2utt=ark:05_feat/wsj_SI284_dot_cor.v2/split20/4/spk2utt --config=0013_librispeech_v1_extractor/ivector_extractor/online_cmvn.conf 0013_librispeech_v1_extractor/ivector_extractor/global_cmvn.stats scp:05_feat/wsj_SI284_dot_cor.v2/split20/4/feats.scp ark:- | splice-feats --left-context=3 --right-context=3 ark:- ark:- | transform-feats 0013_librispeech_v1_extractor/ivector_extractor/final.mat ark:- ark:- |'

The MFCC features I created have 30 dimensions, which differs from the features used by the LibriSpeech model. Could that be the cause?
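
The numbers in the warnings above can be checked by hand. What follows is only a minimal sketch, assuming the ±3 frame splicing shown in the logged pipeline, and assuming the extractor's final.mat was estimated with that same splicing context (the thread does not confirm this). The function names are hypothetical helpers for illustration, not Kaldi APIs. Under those assumptions, the logged feat dim of 91 corresponds to 13 coefficients per frame, while the 40x281 matrix would fit 40 coefficients per frame (280 spliced values plus one offset column).

```python
def spliced_dim(per_frame_dim: int, left_context: int, right_context: int) -> int:
    """Dimension after stacking left_context + 1 + right_context frames."""
    return per_frame_dim * (left_context + 1 + right_context)


def per_frame_dim(spliced: int, left_context: int, right_context: int) -> int:
    """Invert spliced_dim; fail if the window does not divide the dimension."""
    window = left_context + 1 + right_context
    if spliced % window != 0:
        raise ValueError(f"{spliced} is not a multiple of the window size {window}")
    return spliced // window


def transform_accepts(transform_cols: int, feat_dim: int) -> bool:
    """Same condition as the failed assertion: the matrix column count must
    equal the feature dimension, or the dimension plus one offset column."""
    return transform_cols in (feat_dim, feat_dim + 1)


LEFT, RIGHT = 3, 3      # from the splice-feats call in the logged pipeline
OBSERVED_FEAT_DIM = 91  # "feat dim 91" in the transform-feats warning
TRANSFORM_COLS = 281    # columns of the 40x281 matrix in the same warning

# Per-frame dimension of the features that were actually fed in.
print(per_frame_dim(OBSERVED_FEAT_DIM, LEFT, RIGHT))   # 13

# Per-frame dimension the matrix would fit, assuming an offset column
# and the same +/-3 splicing (an assumption, not stated in the thread).
print(per_frame_dim(TRANSFORM_COLS - 1, LEFT, RIGHT))  # 40

# The observed combination fails the check, which matches the warning.
print(transform_accepts(TRANSFORM_COLS, OBSERVED_FEAT_DIM))  # False
```

If the assumptions hold, the features would need to be regenerated with the same MFCC configuration as the pre-trained model before the transform can be applied.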

@danpovey
Contributor

danpovey commented Apr 15, 2021 via email

@TPalawaT
Author

TPalawaT commented Apr 18, 2021

@danpovey Thank you very much. It was indeed a dimension mismatch. There were a few more errors in the script, but it worked in the end.
Thank you very much again!
