Transfer learning wsj->rm #1633
Conversation
…sfer-learning-wsj-rm Conflicts: egs/wsj/s5/steps/nnet3/xconfig_to_configs.py
Thanks-- I'll try to look at this tonight.
…On Thu, May 18, 2017 at 7:24 PM, pegahgh ***@***.***> wrote:
This is the Transfer learning setup which transfers the model trained on
wsj to rm.
The preliminary result without any tuning shows improvement (2.71 -> 2.21)
------------------------------
You can view, comment on, or merge this pull request online at:
#1633
Commit Summary
- added Transfer learning setup using nnet3+chain+tdnn for WSJ->RM.
- Merge branch 'master' of https://github.com/kaldi-asr/kaldi into transfer-learning-wsj-rm
File Changes
- *M* egs/babel_multilang/s5/local/nnet3/run_tdnn_multilingual.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-0> (5)
- *M* egs/rm/s5/RESULTS
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-1> (3)
- *A* egs/rm/s5/local/chain/run_tdnn_wsj_rm.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-2> (184)
- *M* egs/rm/s5/local/online/run_nnet2_common.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-3> (45)
- *M* egs/wsj/s5/steps/libs/nnet3/train/chain_objf/acoustic_model.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-4> (14)
- *M* egs/wsj/s5/steps/libs/nnet3/train/common.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-5> (3)
- *M* egs/wsj/s5/steps/libs/nnet3/xconfig/parser.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-6> (59)
- *M* egs/wsj/s5/steps/libs/nnet3/xconfig/utils.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-7> (8)
- *M* egs/wsj/s5/steps/nnet3/chain/train.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-8> (8)
- *M* egs/wsj/s5/steps/nnet3/train_dnn.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-9> (19)
- *M* egs/wsj/s5/steps/nnet3/xconfig_to_configs.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-10> (47)
- *M* src/nnet3/nnet-component-itf.h
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-11> (5)
- *M* src/nnet3/nnet-utils.cc
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-12> (26)
- *M* src/nnet3/nnet-utils.h
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-13> (3)
Patch Links:
- https://github.com/kaldi-asr/kaldi/pull/1633.patch
- https://github.com/kaldi-asr/kaldi/pull/1633.diff
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#1633>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADJVu6bbFZ76KHKxTg_QELQx29Nizjwgks5r7NMfgaJpZM4Nf6J5>
.
Various requests for more documentation.
At this point it's not easy for me to see what you are doing, so I'm not expressing a view on the actual content.
# This function reads an xconfig file and returns it as a list of layers
# (usually we use the variable name 'all_layers' elsewhere for this).
# It will die if the xconfig file is empty or if there was
# some error parsing it.
def read_xconfig_file(xconfig_filename):
there is no documentation of aux_layers. Also they are components, not layers so the name is wrong.
# This function reads existing model file with nnet3 format and returns it as
# list of layers with name and dimension to be used as auxilary information
# to generate xconfig.
def read_model(model_filename):
there is no documentation of what the return format of this function is, and the name is too non-specific (e.g. get_model_component_info). Don't use the word 'layer' if what you really mean is component, they are different things. And grep for '^component-node', not -node. You could otherwise get problems in future if something that looks like that appears in the file.
I grep '-node', since we need both component-node and input-node!
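A minimal sketch of the parsing being discussed, with the match anchored at the start of the line as the review suggests (the function name follows the `get_model_component_info` suggestion above; the dict shapes are illustrative, not the PR's actual code):

```python
import re

def get_model_component_info(model_lines):
    """Collect key=value info for each node declared in nnet3 model text.

    Anchors the match at the start of the line ('^component-node',
    '^input-node'), so a line that merely *mentions* '-node' somewhere
    in its text can never be picked up by accident.
    """
    node_pattern = re.compile(r'^(component-node|input-node)\b')
    nodes = []
    for line in model_lines:
        if not node_pattern.match(line):
            continue
        info = {}
        for token in line.split()[1:]:
            if '=' in token:
                key, value = token.split('=', 1)
                info[key] = value
        nodes.append(info)
    return nodes
```

This covers both node types the reply mentions while avoiding the false matches Dan warns about.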
steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig \
  --config-dir $dir/configs/ \
-  --nnet-edits="rename-node old-name=output-0 new-name=output"
+  --nnet-edits=$dir/configs/edits.config
nnet-edits expects a string, not a filename. also you are already supplying --config-dir as an arg to the script and I see that elsewhere you are treating edits.config as a standard part of that config directory, so there seems to be some disagreement about whether scripts are supposed to pick that up automatically, or whether it needs to be supplied explicitly.
If you are intending to change how that option works you should rename it and change the documentation.
I modified the script to accept nnet-edits as a filename and apply it via edits-config. If you agree with that, I will update the description and rename the option to edits-config.
Also, edits.config is passed explicitly to xconfig_to_configs.py, but train.py picks up edits.config automatically in the prepare_acoustic_model function.
@@ -0,0 +1,184 @@
#!/bin/bash
set -e
please add explanation of what this script does.
@@ -28,6 +28,12 @@ def get_args():
        epilog='Search egs/*/*/local/{nnet3,chain}/*sh for examples')
    parser.add_argument('--xconfig-file', required=True,
                        help='Filename of input xconfig file')
    parser.add_argument('--existing-model',
The doc-string doesn't explain what this option expects, which is presumably a filename of a .mdl
file containing a previously trained neural net.
@@ -219,17 +225,19 @@ def write_config_files(config_dir, all_layers):
        raise

-def add_back_compatibility_info(config_dir, nnet_edits=None):
+def add_back_compatibility_info(config_dir, existing_model=None,
This function is differently named in kaldi_52. Please make this PR against kaldi_52 as it will save me hassle in merging.
@@ -310,7 +310,8 @@ def train(args, run_opts, background_process_handler):
        logger.info("Creating denominator FST")
        chain_lib.create_denominator_fst(args.dir, args.tree_dir, run_opts)

-    if (args.stage <= -4):
+    if (args.stage <= -4) and os.path.exists("{dir}/config/init.config".format(
The directory should be configs instead of config.
OK.
…On Mon, May 22, 2017 at 4:09 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/babel_multilang/s5/local/nnet3/run_tdnn_multilingual.sh
<#1633 (comment)>:
> steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig \
--config-dir $dir/configs/ \
- --nnet-edits="rename-node old-name=output-0 new-name=output"
+ --nnet-edits=$dir/configs/edits.config
I modified the script to accept nnet-edits as filename and use it using
edits-config. If you agree with that, I will update the description and
option name as edits-config.
Also edits.config passed explicitly to xconfig_to_configs.py but train.py
automatically picks up edits.config at prepare_acoustic_model function.
Force-push: 947f05c to 7872a9e
@vimalmanohar you mentioned you already tested this PR. Would you update me with your thoughts about it?
elif key == "input-dim":
    input_dim = int(value)
elif key == "output-dim":
    output_dim = int(value)
The key can also be "dim", in which case it would be both input-dim and output-dim.
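A sketch of the fix being requested, treating a bare `dim` key as setting both dimensions (the helper name is hypothetical, not the PR's code):

```python
def parse_node_dims(tokens):
    """Extract (input_dim, output_dim) from key=value tokens of a node line.

    A bare 'dim' key (as appears on input-node lines) sets both
    dimensions, which is the case the comment above says the
    parser must also handle.
    """
    input_dim = -1
    output_dim = -1
    for token in tokens:
        if '=' not in token:
            continue
        key, value = token.split('=', 1)
        if key == 'input-dim':
            input_dim = int(value)
        elif key == 'output-dim':
            output_dim = int(value)
        elif key == 'dim':
            input_dim = int(value)
            output_dim = int(value)
    return input_dim, output_dim
```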
key_to_value['name'] = layer_name
if input_dim != -1:
    if output_dim == -1:
        # The layer is input layer type.
I am not sure what this is for. There should never be a case where only input_dim is defined and output_dim is not.
        # The layer is input layer type.
        key_to_value['dim'] = input_dim
elif input_str is not None:
    key_to_value['dim'] = output_dim
input_str is None for input-node, but its dim is still defined by 'dim'
all_layers = []
try:
    f = open(model_filename, 'r')
except Exeption as e:
Exception
@vimalmanohar I fixed some issues with this PR and merged it with master. Since you used this PR, do you have any comments on it?
@@ -424,6 +424,19 @@ def prepare_initial_acoustic_model(dir, run_opts, srand=-1):
    common_train_lib.prepare_initial_network(dir, run_opts,
                                             srand=srand)

    # edits 0.raw using edits.config before adding transition model.
Add an example for edits.config here and explain that it is used for transfer learning.
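For instance, a hypothetical edits.config for this transfer-learning setup might look like the following (the two directives are the ones discussed elsewhere in this thread; the exact names and factor depend on the recipe):

```
# rename the temporary output layer trained on the new data
rename-node old-name=output-tmp new-name=output
# scale down (or freeze, with factor 0) updates to the transferred layers
set-learning-rate-factor name=* learning-rate-factor=0.25
```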
            "".format(dir))
common_lib.execute_command(
    """{command} {dir}/log/edit.log \
        nnet3-copy --edits-config={edits_config} {dir}/0.raw \
I suggest you move this inside prepare_initial_network as this is not just for chain objf, but generic for anything.
Also you are modifying 0.raw itself. Its better to write an intermediate model like pre_edit.raw.
if len(key_value) == 2:
    key = key_value[0]
    value = key_value[1]
    if key == "name":
Give examples for different types of lines that will be parsed. input-node, component-node etc.
key_to_value = dict()
layer_names.append(layer_name)
key_to_value['name'] = layer_name
if input_dim != -1:
Comment which type of lines go to which case.
@@ -60,6 +60,10 @@ def get_dim_from_layer_name(all_layers, current_layer, full_layer_name):
    for layer in all_layers:
        if layer is current_layer:
            break
    # if "." used in layer name
Give example for when this is applicable.
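This applies to names like `tdnn1.affine` coming from an existing model, where the part before the first '.' is the layer name and the rest qualifies a component within it. A sketch of the split (helper name hypothetical):

```python
def split_full_layer_name(full_layer_name):
    """Split a dotted name such as 'tdnn1.affine' into
    ('tdnn1', 'affine'); a plain layer name like 'tdnn2'
    has no qualifier and returns ('tdnn2', None)."""
    if '.' in full_layer_name:
        layer_name, qualifier = full_layer_name.split('.', 1)
        return layer_name, qualifier
    return full_layer_name, None
```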
@@ -219,17 +227,20 @@ def write_config_files(config_dir, all_layers):
        raise

-def add_nnet_context_info(config_dir, nnet_edits=None):
+def add_nnet_context_info(config_dir, existing_model=None,
+                          edits_config=None):
Check indentation.
model = """ - | nnet3-copy --edits-config={0} - {1}""".format(edits_config,
                                                              model)
common_lib.execute_command("""nnet3-init {0} {1}/ref.config """
                           """ {2} """.format(existing_model if
Check indentation.
I am not sure if @danpovey wants this to be in Google standards, since the rest of this file is not.
Thanks for checking this, Vimal.
I'd like it to conform to Google standards as much as possible, but don't be too strict with the column-length of 80, it's hard to stick to that in Python, what with the indentation.
@@ -202,11 +202,13 @@ EOF
echo " relu-renorm-layer name=prefinal-affine-lang-${lang_index} input=tdnn_bn dim=1024"
Is this supposed to be in this PR? Also I do not see any WSJ -> RM examples.
Force-push: df02d6d to 40c85dc
@vimalmanohar would you 1) review the PR and comment on it, and 2) test it on MGB to verify that you can get the best results you had?
-  --nnet-edits="rename-node old-name=output-0 new-name=output"
+  --edits-config=$dir/configs/edits.config

cat <<EOF >> $dir/configs/vars
Is this script relevant to this PR? Also include_log_softmax is no longer used.
I updated xconfig_to_configs.py to accept edits-config instead of nnet-edits, so this script should be updated.
@@ -0,0 +1,192 @@
#!/bin/bash
Is this different from _1a?
This is a soft link to _1a
This is not a softlink. Git will show it as softlink if it is. There is already local/chain/run_tdnn_wsj_rm.sh which is a softlink to _1a.
# and use already trained model on wsj and remove the last layer and
# add new randomly initialized layer and retrain the whole network.
# while training new added layer using rm data.
# The chain config is as run_tdnn_5n.sh and the result is:
"is as" -> "is as in"
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
nnet3-am-copy --raw=true --edits="set-learning-rate-factor name=* learning-rate-factor=$primary_lr_factor" \
I think you can use nnet3-copy instead of nnet3-am-copy --raw=true.
mkdir -p $dir
mkdir -p $dir/configs
cat <<EOF > $dir/configs/network.xconfig
output-layer name=output-tmp input=tdnn6.renorm dim=$num_targets
Can you create an empty xconfig file instead and also remove edits.config?
src/chainbin/chain-est-phone-lm.cc (Outdated)
@@ -32,37 +32,51 @@ int main(int argc, char *argv[]) {
        "Initialize un-smoothed phone language model for 'chain' training\n"
        "Output in FST format (epsilon-free deterministic acceptor)\n"
        "\n"
-       "Usage: chain-est-phone-lm [options] <phone-seqs-rspecifier> <phone-lm-fst-out>\n"
+       "Usage: chain-est-phone-lm [options] <phone-seqs-rspecifier-1> ... <phone-seqs-rspecifier-n> <phone-lm-fst-out>\n"
Use [ ] to indicate that they are optional.
src/nnet3/nnet-utils.cc (Outdated)
@@ -654,6 +654,32 @@ void ReadEditConfig(std::istream &edit_config_is, Nnet *nnet) {
      }
    }
    KALDI_LOG << "Set learning rates for " << num_learning_rates_set << " nodes.";
  } else if (directive == "set-learning-rate-factor") {
    std::string name_pattern = "*";
    // name_pattern defaults to '*' if non is given.
non -> none
# 2016 Vimal Manohar
# Apache 2.0.

""" This script is based on steps/nnet3/chain/train.sh
Change the description
OK, I will do that, but I believe the amount of change to train.py is very small and we could still apply it there.
Add in the description how this script is different from train.py and what it is for.
                             args.lat_dir)

# Set some variables.
num_jobs = 4  # common_lib.get_number_of_jobs(args.lat_dir)
Why is this fixed at 4?
logger.info("Copying the properties from {0} to {1}".format(egs_dir, args.dir))
common_train_lib.copy_egs_properties_to_exp_dir(egs_dir, args.dir)

if (args.stage <= -2) and os.path.exists(args.dir+"/configs/init.config"):
This stage is not required in this script. But you should check that init.raw exists.
I am still not sure it is a good idea to have a separate script train_more.py.
I don't believe there is a current use-case where init.config will exist--
unless you have something like that in the multilingual setup.
We removed support for layer-by-layer pretraining from the xconfig creation
scripts, and I think the training scripts no longer support it either (i.e.
there is no support for adding new layers from layer1.config, layer2.config
and so on).
…On Mon, Jul 17, 2017 at 3:35 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/wsj/s5/steps/libs/nnet3/train/common.py
<#1633 (comment)>:
> @@ -502,13 +502,34 @@ def smooth_presoftmax_prior_scale_vector(pdf_counts,
def prepare_initial_network(dir, run_opts, srand=-3):
- if os.path.exists(dir+"/configs/init.config"):
+ if not os.path.exists("{0}/configs/init.config".format(dir)):
I mean, we should change the code to check os.path.exists(dir+"/init.raw")
not init.config
Also, we should add another assertion to check that at most one of the
two files init.config and edits.config exists.
Oh yes, I forgot, init.config is expected to exist after all, in the normal
case.
But I don't like the way you are changing all these things that are mixed
up in the normal code path-- it's not clear in your changed code what
relates to the transfer learning (or training an already-changed system)
and what does not. That needs to be clarified and it needs to be easy to
verify, from reading the changes on github, that for non-transfer learning,
it would behave the same as previously.
…On Mon, Jul 17, 2017 at 4:44 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/wsj/s5/steps/libs/nnet3/train/common.py
<#1633 (comment)>:
> @@ -502,13 +502,34 @@ def smooth_presoftmax_prior_scale_vector(pdf_counts,
def prepare_initial_network(dir, run_opts, srand=-3):
- if os.path.exists(dir+"/configs/init.config"):
+ if not os.path.exists("{0}/configs/init.config".format(dir)):
We still check existence of init.config and use it for stage -4 (init.raw
generation) and stage -2 (lda generation)
Here are a lot of comments-- after reading through only part of the PR.
But the closer I look at it, the less happy I am with the architecture. My particular concern is the way in which the already-trained models are modified. The way the edits-config is used, and the interaction with the train.py scripts, makes it very confusing. And I think you are using "rename-node" in an unnecessary way where it would be better to just overwrite the old layer name.
I think a better approach than this, would be to have the example script in local/ create a model file directly, e.g. named $dir/input.raw, and pass it in as an optional argument to the training scripts; if this is supplied to the training scripts, then the configs/ directory would not be interrogated (and they wouldn't attempt to initialize the LDA-like matrix).
I'm pondering whether it might be a better idea, even for the existing training scripts, to better separate the model initialization and the training, so that the input would be a .raw file (with LDA layer already initialized), instead of having the training scripts do this. But we can do this as a separate thread.
# configs for transfer learning
common_egs_dir=
#srcdir=../../wsj/s5/
you should probably replace your own directory with the original $srcdir.
Also, for all scripts like this, I think you should check at the top of the script whether the required inputs, such as $src_mdl, are present.
srcdir=/export/a09/pegahgh/kaldi-transfer-learning/egs/wsj/s5-sp
src_mdl=$srcdir/exp/chain/tdnn1d_sp/final.mdl
src_lang=$srcdir/data/lang
src_gmm_mdl=$srcdir/exp/tri4b
why call this _mdl if it's a directory? It should be src_gmm_dir.
# while training new added layer using rm data.
# The chain config is as run_tdnn_5n.sh and the result is:
# System   tdnn_5n  tdnn_wsj_rm_1a  tdnn_wsj_rm_1b  tdnn_wsj_rm_1c
# WER      2.71     2.09            3.45            3.38
You should explain more about why you added the 1b and 1c scripts. I can see that the lexicon in this case may have words without pronunciations, since you use WSJ's lexicon. I guess you added these to demonstrate different ways to do it. But you're not clearly explaining what the difference is.
You say it uses " a src-tree-dir to generate new target alignment and lattices using source model".
But it seems to me a key difference that you didn't mention is that it uses the lexicon from WSJ. I'd like a clear explanation of what 1c is as well.
utils/create_split_dir.pl \
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
this 'zero' in the echo statement is not right. also I don't see that an echo statement is helpful here, you can use a comment.
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
nnet3-copy --edits="set-learning-rate-factor name=* learning-rate-factor=$primary_lr_factor" \
I don't see what the point of the primary_lr_factor is here-- if you are setting a learning-rate factor for all layers, IMO it would make more sense to just multiply the initial and final learning rates by that factor..
@@ -20,32 +21,64 @@
logger.addHandler(logging.NullHandler())

def create_phone_lm(dir, tree_dir, run_opts, lm_opts=None):
    """Create a phone LM for chain training
def create_phone_lm(dir, tree_dir, run_opts, alignment_dirs=None, lm_opts=None):
Please revert your changes to this function and create another version of this function that takes its input from multiple alignment dirs. This will mean that we don't need to so carefully test whether this will break the normal code path; it also keeps the individual functions simple.
Also revert your changes to chain-est-phone-lm; what you need to do can be implemented much more simply.
You can just run a command for each source alignment dir that uses ali-to-phones to dump the phone sequences into a single gzipped archive for that source. And then you can combine the gzipped archives while repeating some of them the requisite number of times, e.g. "gunzip -c $dir/src1_phones.gz $dir/src2_phones.gz $dir/src2_phones.gz|", assuming the second source was repeated twice.
Vimal pointed out that this might fail due to argument list too long. But you can work around that by doing:
for x in [arg_list]; do gunzip -c $x; done | ...
I don't like the C++ code changes because having a scale that is required to be an integer is kind of weird, and because IMO it's just more elegant to repeat the input that many times, since that is what the code does. And also because of the principle that the "unusual case" shouldn't cause code to be modified in things used in the "common case". I want to keep the common case simple, which makes the codebase easier to maintain.
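A self-contained sketch of the combination step Dan describes (demo data and paths are made up; in the recipe each gzipped archive would be produced by ali-to-phones from one alignment dir). Each source archive is listed once per unit of its integer weight, and the loop form avoids an over-long argument list:

```shell
set -e
tmp=$(mktemp -d)

# stand-ins for per-source phone-sequence archives
printf 'a\n' | gzip -c > "$tmp/src1_phones.gz"
printf 'b\n' | gzip -c > "$tmp/src2_phones.gz"

# source 2 has weight 2, so its archive appears twice in the list
arg_list="$tmp/src1_phones.gz $tmp/src2_phones.gz $tmp/src2_phones.gz"

# loop rather than expanding the whole list on one command line,
# so a very long list cannot hit "argument list too long"
combined=$(for x in $arg_list; do gunzip -c "$x"; done)
echo "$combined"
```

In the real script the loop's output would be piped on to chain-est-phone-lm instead of captured.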
                      "{0}/ref.raw".format(config_dir))
def add_nnet_context_info(config_dir, existing_model=None,
                          edits_config=None):
    """This will be removed when python script refactoring is done."""
it seems like you have resolved a git conflict wrongly, this comment is an old comment that we removed.
right_context_final = (args.chunk_right_context_final + model_right_context if
                       args.chunk_right_context_final >= 0 else -1)

# Initialize as "raw" nnet, prior to training the LDA-like preconditioning
this comment seems misplaced.
    "creating does not have an output node called 'output' "
    "(e.g. for multilingual setups). You can set this to "
-   "an edit-string like: "
+   "an edits-config can contain string like: "
This comment still doesn't really explain what the option does. Explaining what an option is useful for is not a substitute for saying what it does.
Also you should say 'edits-config filename', not 'config filename', because they are different things.
But I think you should just remove this option completely, because I think this option is based on a misunderstanding. I think you didn't realize that it's possible to overwrite nodes and components by creating a new node with the old one's name. Just, if an older layer was of a different type (e.g. LSTM vs. TDNN), there may be confusingly named 'orphan' components and component-nodes left around, so it may make sense to do 'remove-orphans' in this situation. You could have this code do remove-orphans automatically in the case when you provide an initial nnet to this script.
                           model)
out = common_lib.get_command_stdout('nnet3-info "{0}" | head -n 4 '
                                    .format(model))
if edits_config is not None:
This whole code block is unreadable and very unclear. And it doesn't look correct to me either (the - " at the beginning).
i.e. layer in *.mdl -> corresponding 'XconfigExistingLayer' layer
'input-node name=ivector dim=100' ->
    'existing name=ivector dim=100'
'component-node name=tdnn1.affine ** input-node=1000 '
input-node -> input-dim and output-node -> output-dim
Update xconfig_to_configs.py
Update make_weighted_den_fst.sh
class XconfigExistingLayer(XconfigLayerBase):
    """
    This class is for lines like
    'existing name=tdnn1.affine dim=40'.
This documentation is not meaningful. Where I wrote:
This class is for lines like
xxxxxx
the line xxxxx would appear in an xconfig file. But these wouldn't appear in an xconfig file.
I don't want to have to give you the documentation word for word. Figure it out.
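For illustration only (a hedged sketch, not the actual generated output; node names and dims are hypothetical): the 'existing' lines in question describe nodes of an already-trained *.mdl, so that later xconfig lines can refer to them, rather than being written by a user:

```
existing name=ivector dim=100
existing name=tdnn1.affine dim=625
```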
echo "e.g.: $0 exp/tri1_ali exp/tri2_ali exp/chain/tdnn_1a_sp"; | ||
echo "Options: " | ||
echo " --cmd (run.pl|queue.pl...) # specify how to run the sub-processes."; | ||
echo "--lm-opts # options for phone LM generation"; |
please document the 'weights' option here. It's important, so it should go in the usage message as well as in the comment.
Some more mostly cosmetic comments.
if [ -z $weights ]; then
  # If 'weights' is not specified, comma-separated array '1' with dim
  # 'num_alignments' is defined as 'weights'.
  for n in `seq 1 $num_alignments`;do weights="$weights,1"; done
this doesn't look right, there would be a leading comma. And it's not ignored by cut (the first field is then the empty string before the leading comma):

$ echo ,a,b,c | cut -d, -f1

$ echo ,a,b,c | cut -d, -f2
a

That means this code path was not tested!
@pegahgh Since you are already creating w_arr below, you can use w_arr=(); then, num_alignments times, do w_arr+=(1.0). Then in line 90 you can use w=${w_arr[$n]}.
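A minimal sketch of that suggestion in bash (num_alignments is a dummy value here; in the script it comes from the number of alignment directories):

```shell
#!/usr/bin/env bash
# Build the default weights as a bash array instead of a comma-separated
# string, so there is no leading-comma problem to trip up 'cut'.
num_alignments=3   # dummy value for illustration

w_arr=()
for n in $(seq 1 "$num_alignments"); do
  w_arr+=(1.0)
done

# Per-alignment lookup is then a plain (0-based) array index:
n=0
w=${w_arr[$n]}
echo "$w"   # prints 1.0
```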
rm $adir/alignment_files.txt 2>/dev/null || true
for x in `seq $w`;do
  for j in `seq $num_jobs`;do
    echo $adir/ali.$j.gz >> $adir/alignment_files.txt
You are making this more complicated than it needs to be by writing to separate files. Write them all to the same file, $dir/alignment_files.txt. Then you don't need a loop in the $cmd script.
I need the loop, since we need a different final.mdl for each "ali-to-phones $adir/final.mdl".
I think the complexity of this command is underestimated, and this is being made more difficult than it previously was.
@pegahgh it is better to write the alignments to $dir/alignment.$n.txt instead of inside $adir, since you may not have write access there. Also, there would be race conditions if multiple experiments are run using the same alignment directories.
I can also write the model into the file on the same line as the alignment. Then we can have a single file.
There will still be one loop inside the command. There needs to be more escaping to split each line into model and alignment, which might again make it more complicated.
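To make the trade-off concrete, here is a sketch of the single-file variant being debated: each line carries both the model and the alignment archive, and the consumer splits the line back apart with 'read' (all paths below are dummies):

```shell
#!/usr/bin/env bash
# Writer side: one "<model> <alignment>" pair per line, all in one list file.
listfile=$(mktemp)
for adir in exp/tri1_ali exp/tri2_ali; do   # dummy alignment dirs
  echo "$adir/final.mdl $adir/ali.1.gz" >> "$listfile"
done

# Consumer side: 'read' splits each line on whitespace, so no extra
# escaping is needed to recover the model/alignment pair.
count=0
while read -r mdl ali; do
  count=$((count + 1))
  # here one would run, e.g.: ali-to-phones "$mdl" "ark:gunzip -c $ali|" ...
done < "$listfile"
rm -f "$listfile"
echo "$count"   # prints 2
```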
src/nnet3/nnet-utils.cc
Outdated
}
}
KALDI_LOG << "Set learning rate factors for " << num_learning_rate_factors_set
          << " nodes.";
change 'nodes' to 'components' since the learning rate factor applies to components, not nodes.
src/nnet3/nnet-utils.h
Outdated
@@ -283,6 +283,9 @@ void CollapseModel(const CollapseModelConfig &config,
    Note: this sets the 'underlying' learning rate, i.e. it will get
    multiplied by any 'learning-rate-factor' set in the nodes.
nodes -> components
src/nnet3/nnet-utils.h
Outdated
@@ -283,6 +283,9 @@ void CollapseModel(const CollapseModelConfig &config,
    Note: this sets the 'underlying' learning rate, i.e. it will get
    multiplied by any 'learning-rate-factor' set in the nodes.

    set-learning-rate-factor [name=<name-pattern>] learning-rate-factor=<learning-rate-factor>
      Sets the learning rate factor for any updatable nodes matching the name pattern.
nodes -> components
I agree! Thanks. Also, I should probably make the same change for set-learning-rate. Do you agree?
Oh, OK as-is then.
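As a usage sketch of the directive quoted above (the name pattern here is hypothetical), a line in an edits-config might read:

```
set-learning-rate-factor name=tdnn*.affine learning-rate-factor=0.25
```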
…On Tue, Sep 5, 2017 at 3:06 PM, Vimal Manohar ***@***.***> wrote:
yes.
…On Tue, Sep 5, 2017 at 3:15 PM, pegahgh ***@***.***> wrote:
Update make_weighted_den_fst.sh
* 'master' of https://github.com/kaldi-asr/kaldi: (43 commits)
  [src,scripts,egs] Transfer learning for ASR with nnet3 (kaldi-asr#1633)
  [src,scripts,egs] Attention modeling, with example scripts (kaldi-asr#1731)
  [src] Fix bug in block matrix addition (thanks: Sidhi Adkoli).
  [egs] Fix inconseqential input-checking bug in Swbd example script (kaldi-asr#1886)
  [build] dependency-check: that python2.7 and python3 exist and 2.7 is default (kaldi-asr#1876)
  [scripts] A cosmetic change to info messages in chain training (kaldi-asr#1880)
  [doc] Keep tutorial code up to date (thanks: Luwei Yang)
  [scripts] Bug-fix in long-utterance-segmentation script (thanks: Armin Oliya) (kaldi-asr#1877)
  [egs] Fixed some issues in the multilingual BABEL example scripts (kaldi-asr#1850)
  [build] Cosmetic fix in Makefile
  Remove memory leaks and unused variables (when CUDA is not enabled) (kaldi-asr#1866)
  [scripts] Fix default for egs.cmd in nnet3 training scripts (kaldi-asr#1865)
  [doc] Fix to how documentation is built (thanks: David van Leeuwen)
  [scripts] Add --decode-extra-opts in steps/decode.sh (required for speech activity detection scripts) (kaldi-asr#1859)
  [src] Adding documentation for lattice discriminative training functions (kaldi-asr#1854)
  [src] Typo fixes in documenation. (kaldi-asr#1857)
  [egs] Update to score.sh in fisher_swbd setup, allow --iter option (kaldi-asr#1853)
  [scripts] bug-fix in TFRNNLM rescoring script (no 'ark' needed for unk.probs file) (kaldi-asr#1851)
  [src] Remove repeated parameter documentation. (kaldi-asr#1849)
  [egs] Aspire example scripts: Update autoencoder example to xconfig (kaldi-asr#1847)
  ...
This is the transfer learning setup, which transfers a model trained on WSJ to RM.
The preliminary result, without any tuning, shows an improvement (2.71 -> 2.21).