Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cosmetic and other minor fixes for segmentation #1784

Merged
merged 70 commits into from Aug 3, 2017

Conversation

vimalmanohar
Copy link
Contributor

No description provided.

Copy link
Contributor

@danpovey danpovey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're making progress. I have a lot of comments but they are all very minor.

@@ -0,0 +1,322 @@
#! /usr/bin/env python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of script that should go in utils/, probably in utils/ctm/. Please create a directory utils/ctm/, put this there, put fix_ctm.sh and convert_ctm.pl there, and make soft links from their old locations.

head -n 10 $dir/utt2dur | paste - $temp | \
awk '{ dur += $2; frames += $4; } END { shift = dur / frames; if (shift > 0.0098 && shift < 0.0102) shift = 0.01; print shift; }' || exit 1;
frame_shift=$(cat $dir/frame_shift)
if [ -z "$frame_shift" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing silently here is a problem because not all invocations of this script check the return status: see steps/nnet3/chain/get_egs.sh. Probably best to fix wsj/s5/steps/nnet3/chain/get_egs.sh to check the status, but also add an error message here.

# Apache 2.0

if [ $# -ne 1 ]; then
echo "This script outputs a mapping from recording to a list of utterances "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script doesn't have a very good, or well-described, interface. It seems that in addition to doing what it says it does, it also creates a segments file if one did not already exist. I don't see why that behavior would be desired; and I don't even see why it's necessary, because you could easily do it in a pipe.
I don't recommend to have it create the segments file, because for a script whose primary purpose is to output to stdout, to have a side effect, is very confusing.

a new wav.scp file.
If --reco2vol is provided, then for each recording, the volume factor
specified in that file is applied.
Otherwise, a volume factor is chosen randomly between
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

randomly -> from a uniform distribution

if [ $# -ne 2 ]; then
echo "This script adds a sox line in wav.scp to resample the audio at a "
echo "different sampling-rate"
echo "Usage: $0 <frequency> <data-dir>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need example.

@@ -36,7 +36,8 @@ class NnetComputerFromEg {

// Compute the output (which will have the same number of rows as the number
// of Indexes in the output of the eg), and put it in "output".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation is wrong now that you have changed the code.

"in compressed format (recommended). Update: this is now "
"only relevant if the features being read are un-compressed; "
"if already compressed, we keep we same compressed format when "
"dumping-egs.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove '-'

po.Register("compress", &compress, "If true, write egs in "
"compressed format.");
po.Register("compress", &compress, "If true, write egs with input features "
"in compressed format (recommended). Update: this is now "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace 'Update: this is now only' with 'This is only'. We don't document the changes to options, except if likely to break things, because the average user won't be aware of, and won't care about, the history.

@@ -197,6 +197,11 @@ int main(int argc, char *argv[]) {
"--online-ivectors option");
po.Register("length-tolerance", &length_tolerance, "Tolerance for "
"difference in num-frames between feat and ivector matrices");
po.Register("targets-length-tolerance", &targets_length_tolerance,
"Tolerance for "
"difference in num-frames after subsampling between "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parenthesize (after subsampling)

po.Register("targets-length-tolerance", &targets_length_tolerance,
"Tolerance for "
"difference in num-frames after subsampling between "
"feat and target matrices");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feat -> feature

Copy link
Contributor

@danpovey danpovey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some very small comments.

awk '{ dur += $2; frames += $4; } END { shift = dur / frames; if (shift > 0.0098 && shift < 0.0102) shift = 0.01; print shift; }' || exit 1;
frame_shift=$(cat $dir/frame_shift)
if [ -z "$frame_shift" ]; then
echo "$0: Could not read get frame shift from directory $dir"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to be redirecting this to stderr.

@@ -7,6 +7,17 @@
# the wav.scp to perturb the volume (typically useful for training data when
# using systems that don't have cepstral mean normalization).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how much testing have you done with this? Can you please test it with directories that have pipes and those that have raw wav files, and make sure it doesn't break in a very obvious way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested for various cases. Also addressed all other comments.


/// This constructor creates NnetIo with name "name", indexes with n=0, x=0,
/// and t values ranging from t_begin to t_begin + feats.NumRows() - 1, and
/// and t values ranging from t_begin to t_begin + t_stride * feats.NumRows() - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need parenthesis in excpression (also below), and watch line length.

// of Indexes in the output of the eg), and put it in "output".
void Compute(const NnetExample &eg, Matrix<BaseFloat> *output) {
// of Indexes in the output with the name 'output_name' of the eg),
// and put it in "output".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a pre-existing problem, but please change "output" to '*output'.


ParseOptions po(usage);
po.Register("binary", &binary_write, "Write output in binary mode");
po.Register("apply-exp", &apply_exp, "If true, apply exp function to "
"output");
po.Register("output-name", &output_name, "Do computation for "
"specified output");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output -> output-node.

@@ -179,10 +177,10 @@ int main(int argc, char *argv[]) {
ParseOptions po(usage);

po.Register("compress", &compress, "If true, write egs with input features "
"in compressed format (recommended). Update: this is now "
"in compressed format (recommended). Update: this is "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove 'Update: '

@vimalmanohar
Copy link
Contributor Author

If there are no more changes, I can rebase and create a cleaner PR.

@danpovey
Copy link
Contributor

danpovey commented Aug 3, 2017

I forgot about this. It's OK, I'm going to squash and merge.

@danpovey danpovey merged commit 35faa4d into kaldi-asr:master Aug 3, 2017
kronos-cm added a commit to kronos-cm/kaldi that referenced this pull request Aug 5, 2017
* 'master' of https://github.com/kaldi-asr/kaldi:
  [egs] Update tedlium and aspire recipes to use xconfigs (avoid crash) (kaldi-asr#1790)
  [egs] fix bugs in Multi-database English LVCSR recipe (kaldi-asr#1785)
  [src,egs,scripts] Cosmetic and other minor fixes, some required for segmentation PR (kaldi-asr#1784)
  [egs] update multi_condition script in swbd (kaldi-asr#1788)
  [src] Circumvent Visual Studio 2017 bug regarding name resolution (kaldi-asr#1783)
  [tools] adding phonetisaurus install scripts (PR#1734)
  [scripts] update nnet3 scripts to fix bug where rejecting 'bad' models was not happening (kaldi-asr#1777)
  [src] make dithering in feature processing more efficient by using random state, thanks: liximin244@gmail.com
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants