[egs][aishell][v1] wav name is not align with its path. #2700

Closed
vzxxbacq opened this issue Sep 13, 2018 · 3 comments

Comments

@vzxxbacq
Contributor

Thanks for everyone's great work. I ran into a problem when running the aishell/v1 recipe.
In the aishell/v1/local/aishell_data_prep.sh script, the author wrote this code:

# get all wavs' path
find $audio_dir -iname "*.wav" | grep -i "wav/train" > $train_dir/wav.flist || exit 1;

# After some steps.....
    # get utt.list
    sed -e 's/\.wav//' $dir/wav.flist | awk -F '/' '{print( $NF)}' | sort > $dir/utt.list
    # then few steps later, get wav.scp_all
    paste -d' ' $dir/utt.list $dir/wav.flist > $dir/wav.scp_all

wav.flist is not sorted but utt.list is, and paste joins the two files line by line, so I got a wav.scp_all in which the utterance IDs are not aligned with their wav paths.

In wav.scp_all:

....
BAC009S0002W0240 /media/fhq/common/1data//data_aishell/wav/train/S0002/BAC009S0002W0225.wav
BAC009S0002W0241 /media/fhq/common/1data//data_aishell/wav/train/S0002/BAC009S0002W0226.wav
BAC009S0002W0242 /media/fhq/common/1data//data_aishell/wav/train/S0002/BAC009S0002W0227.wav
BAC009S0002W0243 /media/fhq/common/1data//data_aishell/wav/train/S0002/BAC009S0002W0228.wav
.....

After I removed the sort step, I got an aligned scp file.
Is this a bug, or did I miss something? Thanks.
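
For illustration, here is one way to avoid the mismatch: derive each utterance ID from the same line that holds its path, and only then sort the pairs. This is a minimal sketch, not the fix that was applied to the recipe. The temporary directory and the /data/... paths are made-up sample data, and using LC_ALL=C for the sort is an assumption based on the C-locale ordering Kaldi expects.

```shell
#!/usr/bin/env bash
# Sketch: pair every utterance ID with the path it was derived from.
# The temp directory and the /data/... paths are hypothetical sample data.
set -euo pipefail

dir=$(mktemp -d)

# An unsorted file list, in whatever order `find` happened to return it.
cat > "$dir/wav.flist" <<'EOF'
/data/wav/train/S0002/BAC009S0002W0242.wav
/data/wav/train/S0002/BAC009S0002W0240.wav
/data/wav/train/S0002/BAC009S0002W0241.wav
EOF

# Take the last path component, strip the .wav extension to get the ID,
# print "ID path" on one line, then sort the pairs by ID.
awk -F '/' '{ id = $NF; sub(/\.[Ww][Aa][Vv]$/, "", id); print id, $0 }' \
  "$dir/wav.flist" | LC_ALL=C sort -k1,1 > "$dir/wav.scp_all"

cat "$dir/wav.scp_all"
```

Because the ID and the path travel together through the sort, the order of wav.flist no longer matters. Sorting wav.flist itself before building utt.list would be another option, as long as both files end up in the same order.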

@danpovey
Contributor

@naxingyu can you please look at this?

@naxingyu
Contributor

naxingyu commented Sep 14, 2018 via email

@vzxxbacq
Contributor Author

issue fixed, thanks.
