WIP: fix for utt2dur when applied on speed-perturbed data #678

Merged
merged 4 commits into from
Apr 12, 2016

@freewym freewym commented Apr 10, 2016

This PR fixes the issue raised in #671.

WaveInfoHolder and WaveHolder are not related by inheritance, so for now I instantiate SequentialTableReader separately in each branch of the if/else in featbin/wav-to-duration.cc, which duplicates some code. Please let me know if there is a better solution.

Yiming

@@ -68,7 +70,10 @@ else
echo "$0: wav-to-duration is not on your path"
exit 1;
fi

I think at this point, if read_entire_file is false, you should check whether the wav.scp has any of the offending sox commands that tend to create this problem, and if so, set it to true and print a message saying why.


Added code to check if there are any sox commands with keyword "speed"; if so, set read_entire_file=true.

Yiming

if ! wav-to-duration scp:$data/wav.scp ark,t:$data/utt2dur 2>&1 | grep -v 'nonzero return status'; then

read_entire_file=false
if [ `cat $data/wav.scp | sed -n '/sox.*speed/p' | wc -l` -gt 0 ]; then

I think a grep or grep -E command with output piped to /dev/null would be more elegant-- grep returns true if it matched at least one line.
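
The suggestion above can be sketched as follows. This is a minimal illustration, not the code that was merged: the temporary file stands in for $data/wav.scp, and its two entries are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for $data/wav.scp; the entries are invented for illustration.
wav_scp=$(mktemp)
cat > "$wav_scp" <<'EOF'
utt1 /path/to/utt1.wav
sp0.9-utt1 sox -t wav /path/to/utt1.wav -t wav - speed 0.9 |
EOF

read_entire_file=false
# grep exits with status 0 when at least one line matches, so it can serve
# directly as the condition of the if; the matched lines are discarded.
if grep 'sox.*speed' "$wav_scp" > /dev/null; then
  read_entire_file=true
fi
echo "read_entire_file=$read_entire_file"
rm -f "$wav_scp"
```

Compared with counting lines through sed and wc, this avoids the backticks and the numeric comparison, and it reads the file without an extra cat.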


On 11/04/16 21:16, Daniel Povey wrote:

In egs/wsj/s5/utils/data/get_utt2dur.sh
#678 (comment):

@@ -68,7 +68,14 @@ else
echo "$0: wav-to-duration is not on your path"
exit 1;
fi

    if ! wav-to-duration scp:$data/wav.scp ark,t:$data/utt2dur 2>&1 | grep -v 'nonzero return status'; then
    read_entire_file=false
    if [ `cat $data/wav.scp | sed -n '/sox.*speed/p' | wc -l` -gt 0 ]; then

I think a grep or grep -E command with output piped to /dev/null would
be more elegant-- grep returns true if it matched at least one line.

Instead of stuffing the output to /dev/null, you might want to use -q:

    -q, --quiet, --silent
           Quiet; do not write anything to standard output.  Exit
           immediately with zero status if any match is found, even if an
           error was detected. Also see the -s or --no-messages option.
           (-q is specified by POSIX.)

Tony
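
For illustration, here is a rough sketch of the -q variant combined with the earlier request to print a message explaining why read_entire_file is switched on. The helper name has_speed_perturbation, the two temporary scp files, and the message wording are all assumptions made for this example; none of them come from the PR.

```shell
#!/usr/bin/env bash
# has_speed_perturbation is a hypothetical helper name, not part of the PR.
has_speed_perturbation() {
  # -q prints nothing and exits 0 as soon as the first match is found.
  grep -q 'sox.*speed' "$1"
}

# Invented stand-ins for a plain and a speed-perturbed wav.scp.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '%s\n' 'utt1 /path/to/utt1.wav' > "$tmpdir/plain.scp"
printf '%s\n' 'sp1.1-utt1 sox -t wav /path/to/utt1.wav -t wav - speed 1.1 |' \
  > "$tmpdir/perturbed.scp"

for scp in "$tmpdir/plain.scp" "$tmpdir/perturbed.scp"; do
  read_entire_file=false
  if has_speed_perturbation "$scp"; then
    read_entire_file=true
    echo "$0: $(basename "$scp") contains sox speed commands; reading entire files (slower)"
  fi
  echo "$(basename "$scp"): read_entire_file=$read_entire_file"
done
```

Because -q stops at the first match, the check does not need to scan the rest of a long wav.scp once a speed-perturbed entry is found.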



Thanks. Fixed.

@danpovey

Merging. Thanks!

@danpovey danpovey merged commit 4488d3c into kaldi-asr:master Apr 12, 2016
@freewym freewym deleted the dur_fix branch April 12, 2016 21:59