Error: in data_fbank/train_960_cleaned, recording-ids extracted from wav.scp and reco2dur file differ #13

Closed
sunnwmy opened this issue Aug 31, 2018 · 1 comment

Comments

sunnwmy commented Aug 31, 2018

I ran all the steps in run.sh over several days and finally obtained 'train_960_cleaned' for training the deep FSMN. But when I start training by running 'local/nnet/run_fsmn.sh DFSMN_S', it fails with this error:

`steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3_cleaned/ivectors_dev_other_hires using the extractor in exp/nnet3_cleaned/extractor.
steps/make_fbank.sh --nj 30 --cmd run.pl --fbank-config conf/fbank.conf data_fbank/train_960_cleaned exp/make_fbank/train_960_cleaned fbank/train_960_cleaned
steps/make_fbank.sh: moving data_fbank/train_960_cleaned/feats.scp to data_fbank/train_960_cleaned/.backup
utils/validate_data_dir.sh: Error: in data_fbank/train_960_cleaned, recording-ids extracted from wav.scp and reco2dur file
utils/validate_data_dir.sh: differ, partial diff is:
1,301545c1,281081
< 100-121669-0000-1
< 100-121669-0001-1
< 100-121669-0002-1
< 100-121669-0003-1
< 100-121669-0004-1
...

986-129388-0107
986-129388-0108
986-129388-0109
986-129388-0110
986-129388-0111
986-129388-0112
[Lengths are /tmp/kaldi.rudy/utts=301545 versus /tmp/kaldi.rudy/recordings.reco2dur=281081]`

It seems the number of entries in utts (301545) and in recordings.reco2dur (281081) is not the same, but validate_data_dir.sh expects them to match. Does anyone know how to fix this? Any advice would be appreciated. Thanks!
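For anyone hitting the same message, it can help to look at the mismatch directly instead of relying on the partial diff. Below is a minimal sketch, assuming both wav.scp and reco2dur are plain text files with one entry per line and the ID as the first whitespace-separated field. The helper names `read_ids` and `compare_ids` are made up for illustration and are not part of Kaldi.

```python
def read_ids(path):
    """Return the set of first-column IDs from a Kaldi-style table file."""
    ids = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if fields:  # skip blank lines
                ids.add(fields[0])
    return ids


def compare_ids(path_a, path_b):
    """Return (only_in_a, only_in_b) as sorted lists of IDs."""
    ids_a = read_ids(path_a)
    ids_b = read_ids(path_b)
    return sorted(ids_a - ids_b), sorted(ids_b - ids_a)
```

Calling `compare_ids("data_fbank/train_960_cleaned/wav.scp", "data_fbank/train_960_cleaned/reco2dur")` would return the IDs present in only one of the two files, which shows whether the difference is a few stale entries or a systematic naming mismatch.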

sunnwmy commented Aug 31, 2018

Problem solved: just update 'validate_data_dir.sh'. The old version has bugs in how it handles the reco2dur file.

sunnwmy closed this as completed Aug 31, 2018