
Multilingual using modified configs #1290

Merged
merged 62 commits into kaldi-asr:master on May 17, 2017

Conversation

@pegahgh (Contributor) commented Dec 27, 2016

This is a modified multilingual setup based on the new xconfig and training scripts.
In this setup, xconfig is used to create the network configuration for multilingual training.
Egs generation has been moved out of the training script, and the multilingual egs dir is passed to train_raw_dnn.py.
A new script has also been added for average-posterior computation and prior adjustment.

@pegahgh (Contributor Author) commented Dec 27, 2016

@danpovey
I added the modified multilingual training to this branch. I wonder if someone can help me tune the model.
I would be thankful if @vimalmanohar and @jtrmal could review this branch.

@danpovey (Contributor) left a comment:

This combines some code comments with some much more important comments about the use of scps and disk access order.
Please remind me if we had any discussion about this, and whether we agreed on any particular approach. Does the prepare-multilingual-egs script actually re-dump the egs?

I think maybe what I was thinking is that egs from different languages would be merged and randomized in memory and split to multiple archives while maintaining fairly linear access to files. What you are doing seems to have a lot of random-access.

I'd rather get the high-level issues sorted out before talking about the code-level issues again.

@@ -301,7 +125,16 @@ int main(int argc, char *argv[]) {
"feature left-context that we output.");
po.Register("right-context", &right_context, "Can be used to truncate the "
"feature right-context that we output.");

po.Register("weights", &weight_str,
Contributor:

Be more careful with line length.
Also, I really don't like the use of TableReader to read such a tiny file, it wasn't designed for that kind of use, and it could be confusing.
How about an option like
po.Register("scales", &scales_str,
"Can be used to scale the features on a subset of inputs or "
"outputs of the NnetExamples being copied. "
"e.g. --io-scales='output-1=0.4,output-2=1.2'.");

and you could have a function like this:

// Parses the scales for the --scales option, e.g. "output1=0.4,output2=0.9", and outputs
// it as a map. Dies with an error message if there was a problem doing this.
void ParseScales(const std::string scales_str, std::map<std::string, BaseFloat> *scales);

You will want to use the function SplitStringToVector two times in this function, with "," and with "=".
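
A minimal sketch of what that function could look like (illustrative only, not code from this PR; it assumes Kaldi's SplitStringToVector and ConvertStringToReal from util/text-utils.h):

// Sketch: parses e.g. "output1=0.4,output2=0.9" into a map, or dies
// with an error message if the string is malformed.
void ParseScales(const std::string &scales_str,
                 std::map<std::string, BaseFloat> *scales) {
  std::vector<std::string> pairs;
  SplitStringToVector(scales_str, ",", true, &pairs);   // split on ','
  for (size_t i = 0; i < pairs.size(); i++) {
    std::vector<std::string> fields;
    SplitStringToVector(pairs[i], "=", true, &fields);  // split on '='
    BaseFloat scale;
    if (fields.size() != 2 || !ConvertStringToReal(fields[1], &scale))
      KALDI_ERR << "Invalid --scales option: '" << scales_str << "'";
    (*scales)[fields[0]] = scale;
  }
}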

Please make a similar modification to the "outputs" option, like this:
po.Register("rename-io", &rename_io_str,
"This option can be used to rename a subset of inputs or "
"outputs of the NnetExamples being copied. For example: "
"--rename-io='input=input-1,output=output-1'. ");

Contributor:

cancel that previous comment, but can you please register these options with better strings?
E.g. specify that it's the Rspecifier of a table indexed by the key of the input examples.

// or crashes if it is not there.
// e.g. output-0, output-xent
int32 NumOutputIndexes(const NnetExample &eg) {
Contributor:

Actually there is no need to crash if nothing named "output" is found; the
--measure-output-frames option is always false now and will be removed within
a couple of weeks. Just return 1 in that case.
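
A sketch of the suggested behavior (assuming the usual NnetExample layout, where eg.io is a vector of NnetIo members; the exact body may differ):

int32 NumOutputIndexes(const NnetExample &eg) {
  for (size_t i = 0; i < eg.io.size(); i++)
    if (eg.io[i].name == "output")
      return eg.io[i].indexes.size();
  return 1;  // no NnetIo named "output": just return 1, don't crash.
}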

@@ -332,14 +168,61 @@ int main(int argc, char *argv[]) {
int32 index = (random ? Rand() : num_written) % num_outputs;
if (frame_str == "" && left_context == -1 && right_context == -1 &&
Contributor:

To avoid too much duplication of code, you can put the shared code that exists in both branches of the if-statement into a function, or two functions.
Also, if both of these options are not used, please avoid the redundant copy; you can have another if-statement where each branch has Write() on a different variable.


Also computes the minimum and maximum "t" values in the "input" and
"output" NnetIo members.
**/
Contributor:

this function is too special-purpose to be put in a header, as it combines a couple of different functionalities in one.

min_output_t or t > max_output_t.
Will crash if filtering removes all Indexes of "input" or "output".
*/
void FilterExample(const NnetExample &eg,
Contributor:

... I actually think it would be better to leave all these things in nnet3-copy-egs.cc for now... they are all of quite specialized use and they will bulk up the library for no advantage.

@@ -39,7 +39,7 @@ num_utts_subset=300 # number of utterances in validation and training
num_valid_frames_combine=0 # #valid frames for combination weights at the very end.
num_train_frames_combine=10000 # # train frames for the above.
num_frames_diagnostic=4000 # number of frames for "compute_prob" jobs
samples_per_iter=400000 # this is the target number of egs in each archive of egs
samples_per_iter=40000 # this is the target number of egs in each archive of egs
Contributor:

you shouldn't change defaults like this.

@@ -56,6 +56,7 @@ online_ivector_dir= # can be used if we are including speaker information as iV
cmvn_opts= # can be used for specifying CMVN options, if feature type is not lda (if lda,
# it doesn't make sense to use different options than were used as input to the
# LDA transform). This is used to turn off CMVN in the online-nnet experiments.
generate_egs_scp=false # If true, it will generate egs.JOB.*.scp per egs archive
Contributor:

er... did we discuss generating scps? This is rather dangerous from a disk-access-order perspective.

if [ $stage -le 0 ]; then
echo "$0: allocating multilingual examples for training."
# Generate egs.*.scp for multilingual setup.
$cmd $megs_dir/log/allocate_multilingual_examples_train.log \
Contributor:

I didn't realize that you were doing this using scps... not sure if we discussed this... If the training jobs have to effectively do random access, it will be terribly slow and thrash the disk like crazy (or at least use up too much of the memory on the NFS server). Even if you had them sorted nicely so it was just skipping between different archive files but accessing the individual files mostly in sequential order, the table-reading code really isn't optimized for this case; it would be opening and closing files like crazy, and who knows what that will do to NFS caching.

I think it would be better to arrange somehow for this script to merge and maybe shuffle multiple archives of egs in memory and split them to different outputs via nnet3-copy-egs, to create the final archives.
Any renaming and scales could be applied via language-dependent options to language-dependent instances of nnet3-copy-egs (maybe inside pipes in rspecifiers), which would avoid the rather ugly dependence on the key to decide the renaming options and the scale options in nnet3-copy-egs.

@vimalmanohar knows about this stuff... I'm surprised if he would have signed off on this approach.

scaled_mat.Scale(alpha);
cmat.Scale(alpha);
cmat.CopyToMat(&scaled_comp_mat);
scaled_comp_mat.ApproxEqual(scaled_mat, 1.0e-04);
Contributor:

ApproxEqual returns a bool and this would not check anything. You need to check the value.
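
For example, the check could be wrapped in an assertion (a sketch, assuming this is inside a test where KALDI_ASSERT is appropriate):

KALDI_ASSERT(scaled_comp_mat.ApproxEqual(scaled_mat, 1.0e-04));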

@@ -97,6 +97,11 @@ class NnetComputeProb {
// or NULL if there is no such info.
const SimpleObjectiveInfo *GetObjective(const std::string &output_name) const;

// return objective info for all outputs
Contributor:

I think it would be nicer to return a string that would summarize the objectives for different outputs, in a way that could be used more easily by the calling code (and would look nice in the normal case where there was just one output).
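
For instance, a hypothetical helper along these lines (the name, placement, and exact format are illustrative only):

// Hypothetical sketch: returns e.g. "'output-0': -1.23 per frame over 4000 frames."
std::string GetObjectiveSummary() const {
  std::ostringstream os;  // requires <sstream>
  unordered_map<std::string, SimpleObjectiveInfo, StringHasher>::const_iterator
      iter = objf_info_.begin(), end = objf_info_.end();
  for (; iter != end; ++iter) {
    double weight = iter->second.tot_weight,
        per_frame = (weight != 0.0 ? iter->second.tot_objective / weight : 0.0);
    os << "'" << iter->first << "': " << per_frame
       << " per frame over " << weight << " frames. ";
  }
  return os.str();
}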

@danpovey (Contributor):

Hm, I see now in the allocate-egs code that you try hard to keep things linear.
A solution to the problem of constantly opening and closing files is that, instead of individually allocating egs from particular source files, you could allocate them in fairly large blocks -- say, blocks of 50 egs.

Of course the scp files could be quite large in general; they may be 1/10th the size of the actual egs in some setups, or more depending on the path lengths. But this is probably OK: it's not even a common use-case, and 10% is manageable.

I hope your python script is smart about not reading more stuff into memory than it needs.

@danpovey (Contributor):

... After thinking about it, I'm probably OK with the basic pattern of allocating the per-example weights and the output-names-- I see that it's hard to do it efficiently any other way in this scenario, but I think it needs (a) much better and clearer documentation, and (b) better and clearer code.

@pegahgh (Contributor Author) commented Dec 28, 2016 via email

@danpovey (Contributor) commented Feb 1, 2017

has conflicts! Please resolve

@@ -222,11 +224,10 @@ void NnetTrainer::PrintMaxChangeStats() const {
void ObjectiveFunctionInfo::UpdateStats(
const std::string &output_name,
int32 minibatches_per_phase,
int32 minibatch_counter,
Contributor:

I think you should revert this change and do it another way if possible. It's too likely to cause problems elsewhere (e.g. merge conflicts with adversarial training). When you call it with num_minibatches_processed_++, can't you just delay incrementing num_minibatches_processed_ until later on?
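
That is, something like the following at the call site (a sketch of the suggested pattern, not the actual code in the PR):

// Pass the current value instead of post-incrementing in the call:
objf_info_[io.name].UpdateStats(io.name, config_.print_interval,
                                num_minibatches_processed_,
                                tot_weight, tot_objf);
// ... then, after all outputs of this minibatch have been processed:
num_minibatches_processed_++;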

Contributor Author:

I think the previous method was wrong, since it assumes a single output in the egs. The phase should be computed w.r.t. num_minibatches per output, not the total num_minibatches.

for (; iter != end; ++iter) {
const NnetIo &io = *iter;
io_count++;
Contributor:

it looks like this variable is never read.

from __future__ import print_function
import re, os, argparse, sys, math, warnings, random, io, imp
import numpy as np

Contributor:

this file should not have been included in your patch, I think something went wrong resolving the merge.

# The last dir corresponds to multilingual egs directory,
# that is generated using this script, but
# it requires single egs dirs to exist.
multi_egs_dir = args.egs_dir.split()
Contributor:

I don't really like this script having to deal with the multilingual egs. I was thinking of moving the egs-generation out of the train.py scripts anyway, as it's kind of unrelated to the rest of what they do.
Can't we just insist that in the multilingual case, the egs must already have been generated, and the dir must be supplied using the common-egs-dir option?

Contributor Author:

Sure, I think that is a better idea! I planned to do that!

Contributor Author:

It seems I already generate egs outside train_raw.py for the multilingual recipe local/nnet3/run_tdnn_multilingual.sh (I should revert these changes!) and pass it as egs-dir to train.py.

dest='use_multitask_egs',
default=False, choices=["true", "false"],
help='If true, it is assumed there are different examples '
'correspond to multiple tasks in output layer.')
Contributor:

Firstly, there is a merge problem here. This code has been refactored and now lives elsewhere, in steps/libs/. Vimal can help you.
But this help message is not great.
Actually I think a better thing would be to get rid of this option altogether, to insist that the egs be supplied externally, and to have the script itself work out somehow whether the egs directory supplied is of the "multilingual" type, or multitask type (let's have a consistent name at the script level, either would be OK).

Contributor Author:

OK, I can add a file num_langs in the multilingual egs dir! Would that be good?

@pegahgh (Contributor Author) commented Feb 2, 2017:

The idea is that we compute the average posterior and adjust the prior inside the train_*.py script, but in the multilingual setting we need to compute and adjust it per output, so I moved this computation outside train_raw_dnn.py!
I don't think it should really be part of the training script.
@vimalmanohar do you have any suggestions?

Contributor Author:

The comment is not about the multitask option; it is about adding the separate python script egs/wsj/s5/steps/nnet3/compute_and_adjust_priors.py.

@danpovey (Contributor) commented Feb 2, 2017 via email

@danpovey (Contributor) commented Feb 2, 2017 via email

@danpovey (Contributor) commented Feb 2, 2017 via email

@danpovey (Contributor) commented Feb 2, 2017 via email

@danpovey (Contributor) commented Feb 2, 2017 via email

. ./cmd.sh
set -e
stage=4
use_flp=false
@jtrmal (Contributor) commented May 3, 2017:

please delete use_flp (we don't use it anymore). Well, at least we should not.

multi_data_dirs[$lang_index]=data/${lang_list[$lang_index]}/train${suffix}${feat_suffix}
multi_egs_dirs[$lang_index]=exp/${lang_list[$lang_index]}/nnet3/egs${ivector_suffix}
multi_ali_dirs[$lang_index]=exp/${lang_list[$lang_index]}/${alidir}${suffix}
multi_ivector_dirs[$lang_index]=exp/${lang_list[$lang_index]}/nnet3/ivectors_train${suffix}${ivector_suffix}
Contributor:

this should probably be

multi_ivector_dirs[$lang_index]=exp/${lang_list[$lang_index]}/nnet3/ivectors_train${suffix}${feat_suffix}${ivector_suffix}

@jtrmal (Contributor) commented May 3, 2017

I'm getting an error:

./local/nnet3/run_tdnn_multilingual.sh: creating multilingual neural net configs using the xconfig parser
steps/nnet3/xconfig_to_configs.py --xconfig-file exp/nnet3/multi_bnf_sp/configs/network.xconfig --config-dir exp/nnet3/multi_bnf_sp/configs/ --nnet-edits=rename-node old-name=output-0 new-name=output
Traceback (most recent call last):
  File "steps/nnet3/xconfig_to_configs.py", line 309, in <module>
    main()
  File "steps/nnet3/xconfig_to_configs.py", line 304, in main
    check_model_contexts(args.config_dir, args.nnet_edits)
  File "steps/nnet3/xconfig_to_configs.py", line 287, in check_model_contexts
    if ((contexts['init']['left-context'] > contexts['ref']['left-context'])
KeyError: 'left-context'

When I run the script a second time, the error disappears. After deleting the directory exp/nnet3/multi_bnf_sp/configs/ and running the script again, the error resurfaces.

@jtrmal (Contributor) commented May 4, 2017 via email

fi

mfccdir=mfcc_hires/$lang
steps/make_mfcc.sh --nj $my_nj --mfcc-config conf/mfcc_hires.conf \
Contributor:

this should be make_mfcc or make_mfcc_pitch_online depending on the use_pitch variable

@jtrmal (Contributor) commented May 9, 2017

Pegah, this is still an issue:

steps/nnet3/xconfig_to_configs.py --xconfig-file exp/nnet3/multi_bnf_sp/configs/network.xconfig --config-dir exp/nnet3/multi_bnf_sp/configs/ --nnet-edits=rename-node old-name=output-0 new-name=output
Traceback (most recent call last):
  File "steps/nnet3/xconfig_to_configs.py", line 309, in <module>
    main()
  File "steps/nnet3/xconfig_to_configs.py", line 304, in main
    check_model_contexts(args.config_dir, args.nnet_edits)
  File "steps/nnet3/xconfig_to_configs.py", line 287, in check_model_contexts
    if ((contexts['init']['left-context'] > contexts['ref']['left-context'])
KeyError: 'left-context'

When running the same thing again, the error disappears.

@pegahgh (Contributor Author) commented May 10, 2017

@jtrmal I checked xconfig and the problem is fixed. I also fixed other issues, e.g. the decoding script, and reran the experiment.

@danpovey (Contributor) left a comment:

These comments are small; can you please address them right away? Vimal says that his speech-silence classification stuff depends on this.
I notice that you have removed egs/hkust/s5/local/ext/195k_chinese_word2char_map. I think this may be unintentional (probably that file shouldn't be in the repo, but that's a separate issue).

There will be a bunch of merge conflicts when we merge to kaldi_52 but it might be easiest for me to resolve them myself.

@@ -235,5 +235,16 @@ const SimpleObjectiveInfo* NnetComputeProb::GetObjective(
return NULL;
}

double NnetComputeProb::GetTotalObjective(double *tot_weight) const {
double tot_objectives = 0.0;
unordered_map<std::string, SimpleObjectiveInfo, StringHasher>::const_iterator
Contributor:

you should set *tot_weight = 0 at this point.
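
That is, the fixed function might look like this (a sketch based on the quoted code, assuming objf_info_ holds the per-output SimpleObjectiveInfo totals):

double NnetComputeProb::GetTotalObjective(double *tot_weight) const {
  double tot_objf = 0.0;
  *tot_weight = 0.0;  // initialize before accumulating.
  unordered_map<std::string, SimpleObjectiveInfo, StringHasher>::const_iterator
      iter = objf_info_.begin(), end = objf_info_.end();
  for (; iter != end; ++iter) {
    tot_objf += iter->second.tot_objective;
    *tot_weight += iter->second.tot_weight;
  }
  return tot_objf;
}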

// returns the objective-function info for this output name (e.g. "output"),
// or NULL if there is no such info.
const SimpleObjectiveInfo *GetObjective(const std::string &output_name) const;

// It returns the objf sum, and it outputs the corresponding total weight
// to '*tot_weight'.
// NnetCombiner::ComputeObjfAndDerivFromNnet() would use this function instead.
Contributor:

The sentence "NnetCombiner::ComputeObjfAndDerivFromNnet() would use this function instead" is not good documentation: (1) instead of what? (2) this documents the change, that's what git commit comments are for; it should document what the function does currently.

Change the documentation to:
This function returns the total objective over all output nodes recorded here, and outputs to 'tot_weight' the total weight (typically the number of frames) corresponding to it; you would divide the return value by 'tot_weight' to get a normalized objective that would be easily interpretable by a human.

@@ -283,6 +289,9 @@ class GeneralMatrix {
/// Implemented in ../cudamatrix/cu-sparse-matrix.cc
void AddToMat(BaseFloat alpha, CuMatrixBase<BaseFloat> *cu_mat,
MatrixTransposeType trans = kNoTrans) const;

/// scale each element of matrix with a scalar value.
Contributor:

change 'with a scalar value' -> 'by alpha'. scale->Scale.

@@ -37,8 +37,8 @@ int main(int argc, char *argv[]) {
"compute-kaldi-pitch-feats --sample-frequency=8000 scp:wav.scp ark:- \n"
"\n"
"See also: process-kaldi-pitch-feats, compute-and-process-kaldi-pitch-feats\n";

Contributor:

there are blank spaces added at the ends of lines in this file and in others. Can you please figure out how to automatically get rid of them? You should change your editor settings too.

Contributor Author:

My editor shows blank spaces at the ends of lines, and I removed them automatically from lots of files.

@@ -287,6 +287,20 @@ def train(args, run_opts, background_process_handler):
num_hidden_layers, num_archives_expanded,
args.max_models_combine, args.add_layers_period,
args.num_jobs_final)
use_multitask_egs = False
Contributor:

you can remove the line: use_multitask_egs = False

use_multitask_egs=False):
""" Generates egs option for multitask(or multilingual) training setup,
if {egs_prefix}output.*.ark or {egs_prefix}weight.*.ark files exists in egs_dir.
Each example line in {egs_prefix}*.scp has corresponding line containing
Contributor:

example line -> line
has -> has a

@pegahgh (Contributor Author) commented May 15, 2017

I will fix them right now.

@jtrmal (Contributor) commented May 16, 2017

BTW, Pegah, you still train the ivector extractor on the whole MFCC+pitch features when pitch=true

@pegahgh (Contributor Author) commented May 16, 2017

@jtrmal yes Yenda, I do that when pitch is true, and it doesn't degrade the performance. Did you get any degradation?

@danpovey (Contributor) commented May 16, 2017 via email

@pegahgh (Contributor Author) commented May 16, 2017

@danpovey I will remove it if you want, but it can improve the setup for some languages! We use online pitch extraction to extract pitch, and there should not be a compatibility issue!

@danpovey (Contributor) commented May 16, 2017 via email

@pegahgh (Contributor Author) commented May 16, 2017

@danpovey, my result is based on a multilingual system with two languages (Assamese and Cantonese), but I can try it on a setup with more languages.
Should I remove pitch for ivector extraction, or do more experiments before removing it?

/// scale each element of matrix with a scalar value.

/// Scale each element of matrix by apla.
Contributor:

you introduced a typo here.

// This function returns the total objective over all output nodes recorded here, and
// outputs to 'tot_weight' the total weight (typically the number of frames)
// corresponding to it.
// NnetCombiner::ComputeObjfAndDerivFromNnet() would use this function instead.
Contributor:

As per my previous comment -- please remove this line of the comment.

@danpovey (Contributor) left a comment:

Some small comments, but please address them right away along with my previous two comments.

@@ -22,7 +22,7 @@ echo "$0 $@" # Print the command line for logging
if [ -f path.sh ]; then . ./path.sh; fi
. parse_options.sh || exit 1;

if [ $# -ne 5 ]; then
if [ $# -lt 3 ] || [ $# -gt 5 ]; then
echo "usage: $0 [options] <selector> <src-data-dir> <dest-data-dir> <log-dir> <path-to-storage-dir>";
Contributor:

You have changed the usage but you have not adjusted the usage message appropriately.

if (objf_info == NULL)
double tot_weights,
tot_objf = prob_computer_->GetTotalObjective(&tot_weights);
if (tot_objf == 0.0)
Contributor:

This if-statement and the associated error message it prints are not suitable; this should not be an error. Please remove both.

@@ -57,10 +57,9 @@ bool IsSimpleNnet(const Nnet &nnet) {
// "input" and everything checks out.
if (NumInputNodes(nnet) == 1)
return true;
// Otherwise, there should be 2 inputs and one
// Otherwise, there should be input node with name input and one
Contributor:

the second "input" should be in double quotes.

@danpovey merged commit 7af2128 into kaldi-asr:master on May 17, 2017
@danpovey (Contributor):

Thanks! Merged.
