Getting error when using pre-trained embeddings to extract ivector #4496

TPalawaT · 2021-04-14T08:57:07Z

I am trying to extract ivectors using a pre-trained librispeech model. When I run the ivector extraction script, I get the error below. Does anyone know why this happens? I also get the same error in my log file with the voxceleb model, in case that helps.
Please let me know if I need to post additional information. Thanks!

copy-feats --compress=true ark:- ark,scp:24_ivector_extraction/wsj_SI284_dot_cor.v2/ivector_wsj_SI284_dot_cor.v2_online.5.ark,24_ivector_extraction/wsj_SI284_dot_cor.v2/ivector_wsj_SI284_dot_cor.v2_online.5.scp 
ivector-extract-online2 --config=0013_librispeech_v1_extractor/ivector_extractor/../iv_conf/ivector_extractor.conf ark:05_feat/wsj_SI284_dot_cor.v2/split20/5/spk2utt scp:/share/mini1/res/t/asr/studio/read-us/wsj/embeddings/exp/exp_embedding/05_feat/wsj_SI284_dot_cor.v2/split20/5/feats.scp ark:- 
LOG (ivector-extract-online2[5.5.1048~1-d211d]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (ivector-extract-online2[5.5.1048~1-d211d]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
ASSERTION_FAILED (ivector-extract-online2[5.5.1048~1-d211d]:Check():online-ivector-feature.cc:87) Assertion failed: (lda_mat.NumCols() == spliced_input_dim || lda_mat.NumCols() == spliced_input_dim + 1)


danpovey · 2021-04-14T14:52:08Z

Likely either a feature dimension mismatch (13 vs. 40), or a splicing mismatch, e.g. the number of frames to splice is wrongly specified. If the latter, it would likely mean you had generated the config files for ivector stuff with the wrong inputs.
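The failed assertion can be read as a dimension check. As a rough sketch (not Kaldi's actual code), the snippet below recomputes the spliced input dimension and compares it with the LDA matrix width. The function names are hypothetical, and the 3-frame left and right context and the 281-column matrix are values assumed here for illustration; check the splice options and `final.mat` of your own extractor.

```python
def spliced_dim(feat_dim: int, left_context: int, right_context: int) -> int:
    """Dimension after stacking left_context + 1 + right_context frames."""
    return feat_dim * (left_context + 1 + right_context)


def lda_matches(lda_cols: int, feat_dim: int,
                left_context: int, right_context: int) -> bool:
    """Mirror of the assertion: the LDA matrix may have one extra offset column."""
    expected = spliced_dim(feat_dim, left_context, right_context)
    return lda_cols in (expected, expected + 1)


# Illustrative values only (assumed, not read from a real extractor directory).
print(lda_matches(lda_cols=281, feat_dim=40, left_context=3, right_context=3))  # True
print(lda_matches(lda_cols=281, feat_dim=13, left_context=3, right_context=3))  # False
```

Under these assumptions, 40-dim features pass the check and 13-dim features fail it, which is the "13 vs. 40" case mentioned above.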


TPalawaT · 2021-04-15T11:17:37Z

Hi, I made some changes to my script. Earlier, I wasn't running the splice-feats binary before applying the transform matrix.
I do that now, but I get the following warnings.

WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46n-46nc020u_1:0-15744 has bad dimension 40x281 versus feat dim 91
LOG (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:161) Applied transform to 0 utterances; 1666 had errors.
WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46n-46nc020v_1:0-12051 has bad dimension 40x281 versus feat dim 91
WARNING (ivector-extract[5.5.1048~1-d211d]:RunPerSpeaker():ivector-extract.cc:123) No features present for utterance WSJ-F46c-46cc0201_1:0-10267

Further down in the log file, I get the final error.

LOG (apply-cmvn-online[5.5.1048~1-d211d]:main():apply-cmvn-online.cc:133) Applied online CMVN to 1666 files, or 1289059 frames.
WARNING (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:110) Transform matrix for utterance WSJ-F46p-46pc041f_1:0-8220 has bad dimension 40x281 versus feat dim 91
LOG (transform-feats[5.5.1048~1-d211d]:main():transform-feats.cc:161) Applied transform to 0 utterances; 1666 had errors.
LOG (gmm-global-get-post[5.5.1048~1-d211d]:main():gmm-global-get-post.cc:115) Done 0 files, 0 with errors, average UBM log-likelihood is -nan over 0 frames.
WARNING (gmm-global-get-post[5.5.1048~1-d211d]:Close():kaldi-io.cc:515) Pipe apply-cmvn-online --spk2utt=ark:05_feat/wsj_SI284_dot_cor.v2/split20/4/spk2utt --config=0013_librispeech_v1_extractor/ivector_extractor/online_cmvn.conf 0013_librispeech_v1_extractor/ivector_extractor/global_cmvn.stats scp:05_feat/wsj_SI284_dot_cor.v2/split20/4/feats.scp ark:- | splice-feats --left-context=3 --right-context=3 ark:- ark:- | transform-feats 0013_librispeech_v1_extractor/ivector_extractor/final.mat ark:- ark:- | had nonzero return status 256
ERROR (gmm-global-get-post[5.5.1048~1-d211d]:~SequentialTableReaderArchiveImpl():util/kaldi-table-inl.h:678) TableReader: error detected closing archive 'apply-cmvn-online --spk2utt=ark:05_feat/wsj_SI284_dot_cor.v2/split20/4/spk2utt --config=0013_librispeech_v1_extractor/ivector_extractor/online_cmvn.conf 0013_librispeech_v1_extractor/ivector_extractor/global_cmvn.stats scp:05_feat/wsj_SI284_dot_cor.v2/split20/4/feats.scp ark:- | splice-feats --left-context=3 --right-context=3 ark:- ark:- | transform-feats 0013_librispeech_v1_extractor/ivector_extractor/final.mat ark:- ark:- |'

The MFCC features I created have 30 dimensions, which differs from the features used by the librispeech model. Could that be the cause?

danpovey · 2021-04-15T16:12:15Z

You shouldn't be using splice-feats, and you should not have to change the script. This is likely about regular (13-dim) vs. "hires" (40-dim) MFCCs.
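The numbers in the warning point the same way. As a sketch, assuming the 3+3 frame splicing from the `splice-feats` command in the log and a transform that may carry one extra offset column, the per-frame dimensions can be recovered from the warning text. The helper name `per_frame_dims` is hypothetical.

```python
import re

# Warning text copied from the transform-feats log earlier in this thread.
WARNING = ("Transform matrix for utterance WSJ-F46n-46nc020u_1:0-15744 "
           "has bad dimension 40x281 versus feat dim 91")


def per_frame_dims(warning, left_context=3, right_context=3):
    """Return (per-frame dim the transform expects, per-frame dim of the features)."""
    match = re.search(r"bad dimension (\d+)x(\d+) versus feat dim (\d+)", warning)
    if match is None:
        raise ValueError("not a transform-feats dimension warning")
    _rows, cols, spliced = (int(g) for g in match.groups())
    window = left_context + 1 + right_context  # frames stacked per output frame
    # The transform may have one extra column for an offset term, so try both widths.
    expected = None
    for extra in (0, 1):
        if (cols - extra) % window == 0:
            expected = (cols - extra) // window
            break
    actual = spliced // window if spliced % window == 0 else None
    return expected, actual


print(per_frame_dims(WARNING))  # (40, 13)
```

With that splicing assumption, the transform expects 40-dim frames (280 = 40 × 7, plus one offset column) while the supplied features are 13-dim (91 = 13 × 7).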


TPalawaT · 2021-04-18T16:39:11Z

@danpovey Thank you very much. It was indeed a problem of dimension mismatch. There were a few more errors in the script but it did work in the end.
Thank you very much again!

TPalawaT closed this as completed Apr 23, 2021
