Transfer learning wsj->rm #1633
Conversation
…sfer-learning-wsj-rm Conflicts: egs/wsj/s5/steps/nnet3/xconfig_to_configs.py
Thanks-- I'll try to look at this tonight.
…On Thu, May 18, 2017 at 7:24 PM, pegahgh ***@***.***> wrote:
This is the Transfer learning setup which transfers the model trained on
wsj to rm.
The preliminary result without any tuning shows improvement (2.71 -> 2.21)
------------------------------
You can view, comment on, or merge this pull request online at:
#1633
Commit Summary
- added Transfer learning setup using nnet3+chain+tdnn for WSJ->RM.
- Merge branch 'master' of https://github.com/kaldi-asr/kaldi into transfer-learning-wsj-rm
File Changes
- *M* egs/babel_multilang/s5/local/nnet3/run_tdnn_multilingual.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-0> (5)
- *M* egs/rm/s5/RESULTS
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-1> (3)
- *A* egs/rm/s5/local/chain/run_tdnn_wsj_rm.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-2> (184)
- *M* egs/rm/s5/local/online/run_nnet2_common.sh
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-3> (45)
- *M* egs/wsj/s5/steps/libs/nnet3/train/chain_objf/acoustic_model.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-4> (14)
- *M* egs/wsj/s5/steps/libs/nnet3/train/common.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-5> (3)
- *M* egs/wsj/s5/steps/libs/nnet3/xconfig/parser.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-6> (59)
- *M* egs/wsj/s5/steps/libs/nnet3/xconfig/utils.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-7> (8)
- *M* egs/wsj/s5/steps/nnet3/chain/train.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-8> (8)
- *M* egs/wsj/s5/steps/nnet3/train_dnn.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-9> (19)
- *M* egs/wsj/s5/steps/nnet3/xconfig_to_configs.py
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-10> (47)
- *M* src/nnet3/nnet-component-itf.h
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-11> (5)
- *M* src/nnet3/nnet-utils.cc
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-12> (26)
- *M* src/nnet3/nnet-utils.h
<https://github.com/kaldi-asr/kaldi/pull/1633/files#diff-13> (3)
Patch Links:
- https://github.com/kaldi-asr/kaldi/pull/1633.patch
- https://github.com/kaldi-asr/kaldi/pull/1633.diff
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#1633>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADJVu6bbFZ76KHKxTg_QELQx29Nizjwgks5r7NMfgaJpZM4Nf6J5>
.
Various requests for more documentation.
At this point it's not easy for me to see what you are doing, so I'm not expressing a view on the actual content.
# This function reads an xconfig file and returns it as a list of layers
# (usually we use the variable name 'all_layers' elsewhere for this).
# It will die if the xconfig file is empty or if there was
# some error parsing it.
def read_xconfig_file(xconfig_filename):
there is no documentation of aux_layers. Also they are components, not layers so the name is wrong.
# This function reads existing model file with nnet3 format and returns it as
# list of layers with name and dimension to be used as auxilary information
# to generate xconfig.
def read_model(model_filename):
there is no documentation of what the return format of this function is, and the name is too non-specific (e.g. get_model_component_info). Don't use the word 'layer' if what you really mean is component, they are different things. And grep for '^component-node', not -node. You could otherwise get problems in future if something that looks like that appears in the file.
I grep '-node', since we need both component-node and input-node!
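A minimal sketch of the parsing being discussed, with the match anchored at the start of the line as the review suggests (the function name follows the `get_model_component_info` suggestion above; the dict shapes are illustrative, not the PR's actual code):

```python
import re

def get_model_component_info(model_lines):
    """Collect key=value info for each node declared in nnet3 model text.

    Anchors the match at the start of the line ('^component-node',
    '^input-node'), so a line that merely *mentions* '-node' somewhere
    in its text can never be picked up by accident.
    """
    node_pattern = re.compile(r'^(component-node|input-node)\b')
    nodes = []
    for line in model_lines:
        if not node_pattern.match(line):
            continue
        info = {}
        for token in line.split()[1:]:
            if '=' in token:
                key, value = token.split('=', 1)
                info[key] = value
        nodes.append(info)
    return nodes
```

This covers both node types the reply mentions while avoiding the false matches Dan warns about.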
steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig \
  --config-dir $dir/configs/ \
-  --nnet-edits="rename-node old-name=output-0 new-name=output"
+  --nnet-edits=$dir/configs/edits.config
nnet-edits expects a string, not a filename. also you are already supplying --config-dir as an arg to the script and I see that elsewhere you are treating edits.config as a standard part of that config directory, so there seems to be some disagreement about whether scripts are supposed to pick that up automatically, or whether it needs to be supplied explicitly.
If you are intending to change how that option works you should rename it and change the documentation.
I modified the script to accept nnet-edits as a filename and apply it via edits-config. If you agree with that, I will update the description and rename the option to edits-config.
Also, edits.config is passed explicitly to xconfig_to_configs.py, but train.py picks up edits.config automatically in the prepare_acoustic_model function.
@@ -0,0 +1,184 @@
#!/bin/bash
set -e
please add explanation of what this script does.
@@ -28,6 +28,12 @@ def get_args():
        epilog='Search egs/*/*/local/{nnet3,chain}/*sh for examples')
    parser.add_argument('--xconfig-file', required=True,
                        help='Filename of input xconfig file')
    parser.add_argument('--existing-model',
The doc-string doesn't explain what this option expects, which is presumably a filename of a .mdl
file containing a previously trained neural net.
@@ -219,17 +225,19 @@ def write_config_files(config_dir, all_layers):
        raise

-def add_back_compatibility_info(config_dir, nnet_edits=None):
+def add_back_compatibility_info(config_dir, existing_model=None,
This function is differently named in kaldi_52. Please make this PR against kaldi_52 as it will save me hassle in merging.
@@ -310,7 +310,8 @@ def train(args, run_opts, background_process_handler):
        logger.info("Creating denominator FST")
        chain_lib.create_denominator_fst(args.dir, args.tree_dir, run_opts)

-    if (args.stage <= -4):
+    if (args.stage <= -4) and os.path.exists("{dir}/config/init.config".format(
The directory should be configs instead of config.
OK.
…On Mon, May 22, 2017 at 4:09 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/babel_multilang/s5/local/nnet3/run_tdnn_multilingual.sh
<#1633 (comment)>:
> steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig \
--config-dir $dir/configs/ \
- --nnet-edits="rename-node old-name=output-0 new-name=output"
+ --nnet-edits=$dir/configs/edits.config
I modified the script to accept nnet-edits as filename and use it using
edits-config. If you agree with that, I will update the description and
option name as edits-config.
Also edits.config passed explicitly to xconfig_to_configs.py but train.py
automatically picks up edits.config at prepare_acoustic_model function.
Force-push: 947f05c to 7872a9e
@vimalmanohar you mentioned you already tested this PR. Would you update me with your thoughts about it?
elif key == "input-dim":
    input_dim = int(value)
elif key == "output-dim":
    output_dim = int(value)
The key can also be "dim", in which case it would be both input-dim and output-dim.
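A sketch of the fix being requested, treating a bare `dim` key as setting both dimensions (the helper name is hypothetical, not the PR's code):

```python
def parse_node_dims(tokens):
    """Extract (input_dim, output_dim) from key=value tokens of a node line.

    A bare 'dim' key (as appears on input-node lines) sets both
    dimensions, which is the case the comment above says the
    parser must also handle.
    """
    input_dim = -1
    output_dim = -1
    for token in tokens:
        if '=' not in token:
            continue
        key, value = token.split('=', 1)
        if key == 'input-dim':
            input_dim = int(value)
        elif key == 'output-dim':
            output_dim = int(value)
        elif key == 'dim':
            input_dim = int(value)
            output_dim = int(value)
    return input_dim, output_dim
```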
key_to_value['name'] = layer_name
if input_dim != -1:
    if output_dim == -1:
        # The layer is input layer type.
I am not sure what this is for. There should never be a case where only input_dim is defined and output_dim is not.
        # The layer is input layer type.
        key_to_value['dim'] = input_dim
elif input_str is not None:
    key_to_value['dim'] = output_dim
input_str is None for input-node, but its dim is still defined by 'dim'
all_layers = []
try:
    f = open(model_filename, 'r')
except Exeption as e:
Exception
@vimalmanohar I fixed some issues with this PR and merged it with master. Since you used this PR, do you have any comments on it?
@@ -424,6 +424,19 @@ def prepare_initial_acoustic_model(dir, run_opts, srand=-1):
    common_train_lib.prepare_initial_network(dir, run_opts,
                                             srand=srand)

    # edits 0.raw using edits.config before adding transition model.
Add an example for edits.config here and explain that it is used for transfer learning.
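For instance, a hypothetical edits.config for this transfer-learning setup might look like the following (the two directives are the ones discussed elsewhere in this thread; the exact names and factor depend on the recipe):

```
# rename the temporary output layer trained on the new data
rename-node old-name=output-tmp new-name=output
# scale down (or freeze, with factor 0) updates to the transferred layers
set-learning-rate-factor name=* learning-rate-factor=0.25
```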
            "".format(dir))
common_lib.execute_command(
    """{command} {dir}/log/edit.log \
        nnet3-copy --edits-config={edits_config} {dir}/0.raw \
I suggest you move this inside prepare_initial_network as this is not just for chain objf, but generic for anything.
Also you are modifying 0.raw itself. Its better to write an intermediate model like pre_edit.raw.
if len(key_value) == 2:
    key = key_value[0]
    value = key_value[1]
    if key == "name":
Give examples for different types of lines that will be parsed. input-node, component-node etc.
key_to_value = dict()
layer_names.append(layer_name)
key_to_value['name'] = layer_name
if input_dim != -1:
Comment which type of lines go to which case.
@@ -60,6 +60,10 @@ def get_dim_from_layer_name(all_layers, current_layer, full_layer_name):
    for layer in all_layers:
        if layer is current_layer:
            break
    # if "." used in layer name
Give example for when this is applicable.
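This applies to names like `tdnn1.affine` coming from an existing model, where the part before the first '.' is the layer name and the rest qualifies a component within it. A sketch of the split (helper name hypothetical):

```python
def split_full_layer_name(full_layer_name):
    """Split a dotted name such as 'tdnn1.affine' into
    ('tdnn1', 'affine'); a plain layer name like 'tdnn2'
    has no qualifier and returns ('tdnn2', None)."""
    if '.' in full_layer_name:
        layer_name, qualifier = full_layer_name.split('.', 1)
        return layer_name, qualifier
    return full_layer_name, None
```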
@@ -219,17 +227,20 @@ def write_config_files(config_dir, all_layers):
        raise

-def add_nnet_context_info(config_dir, nnet_edits=None):
+def add_nnet_context_info(config_dir, existing_model=None,
+                          edits_config=None):
Check indentation.
model = """ - | nnet3-copy --edits-config={0} - {1}""".format(edits_config,
                                                              model)
common_lib.execute_command("""nnet3-init {0} {1}/ref.config """
                           """ {2} """.format(existing_model if
Check indentation.
I am not sure if @danpovey wants this to be in Google standards, since the rest of this file is not.
Thanks for checking this, Vimal.
I'd like it to conform to Google standards as much as possible, but don't be too strict with the column-length of 80, it's hard to stick to that in Python, what with the indentation.
@@ -202,11 +202,13 @@ EOF
echo " relu-renorm-layer name=prefinal-affine-lang-${lang_index} input=tdnn_bn dim=1024"
Is this supposed to be in this PR? Also I do not see any WSJ -> RM examples.
Force-push: df02d6d to 40c85dc
@vimalmanohar would you 1) review the PR and comment on it, and 2) test it on MGB to verify that you can get the best results you had?
-  --nnet-edits="rename-node old-name=output-0 new-name=output"
+  --edits-config=$dir/configs/edits.config

cat <<EOF >> $dir/configs/vars
Is this script relevant to this PR? Also include_log_softmax is no longer used.
I updated xconfig_to_configs.py to accept edits-config instead of nnet-edits, so this script should be updated.
@@ -0,0 +1,192 @@
#!/bin/bash
Is this different from _1a?
This is a soft link to _1a
This is not a softlink. Git will show it as softlink if it is. There is already local/chain/run_tdnn_wsj_rm.sh which is a softlink to _1a.
# and use already trained model on wsj and remove the last layer and
# add new randomly initialized layer and retrain the whole network.
# while training new added layer using rm data.
# The chain config is as run_tdnn_5n.sh and the result is:
"is as" -> "is as in"
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
nnet3-am-copy --raw=true --edits="set-learning-rate-factor name=* learning-rate-factor=$primary_lr_factor" \
I think you can use nnet3-copy instead of nnet3-am-copy --raw=true.
mkdir -p $dir
mkdir -p $dir/configs
cat <<EOF > $dir/configs/network.xconfig
output-layer name=output-tmp input=tdnn6.renorm dim=$num_targets
Can you create an empty xconfig file instead and also remove edits.config?
src/chainbin/chain-est-phone-lm.cc (Outdated)
@@ -32,37 +32,51 @@ int main(int argc, char *argv[]) {
        "Initialize un-smoothed phone language model for 'chain' training\n"
        "Output in FST format (epsilon-free deterministic acceptor)\n"
        "\n"
-       "Usage: chain-est-phone-lm [options] <phone-seqs-rspecifier> <phone-lm-fst-out>\n"
+       "Usage: chain-est-phone-lm [options] <phone-seqs-rspecifier-1> ... <phone-seqs-rspecifier-n> <phone-lm-fst-out>\n"
Use [ ] to indicate that they are optional.
src/nnet3/nnet-utils.cc (Outdated)
@@ -654,6 +654,32 @@ void ReadEditConfig(std::istream &edit_config_is, Nnet *nnet) {
      }
    }
    KALDI_LOG << "Set learning rates for " << num_learning_rates_set << " nodes.";
  } else if (directive == "set-learning-rate-factor") {
    std::string name_pattern = "*";
    // name_pattern defaults to '*' if non is given.
non -> none
# 2016 Vimal Manohar
# Apache 2.0.

""" This script is based on steps/nnet3/chain/train.sh
Change the description
OK, I will do that, but I believe the amount of change to train.py is very small and we could still apply it there.
Add in the description how this script is different from train.py and what it is for.
                             args.lat_dir)

# Set some variables.
num_jobs = 4  # common_lib.get_number_of_jobs(args.lat_dir)
Why is this fixed at 4?
logger.info("Copying the properties from {0} to {1}".format(egs_dir, args.dir))
common_train_lib.copy_egs_properties_to_exp_dir(egs_dir, args.dir)

if (args.stage <= -2) and os.path.exists(args.dir+"/configs/init.config"):
This stage is not required in this script. But you should check that init.raw exists.
I am still not sure it is a good idea to have a separate script train_more.py.
I don't believe there is a current use-case where init.config will exist--
unless you have something like that in the multilingual setup.
We removed support for layer-by-layer pretraining from the xconfig creation
scripts, and I think the training scripts no longer support it either (i.e.
there is no support for adding new layers from layer1.config, layer2.config
and so on).
…On Mon, Jul 17, 2017 at 3:35 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/wsj/s5/steps/libs/nnet3/train/common.py
<#1633 (comment)>:
> @@ -502,13 +502,34 @@ def smooth_presoftmax_prior_scale_vector(pdf_counts,
def prepare_initial_network(dir, run_opts, srand=-3):
- if os.path.exists(dir+"/configs/init.config"):
+ if not os.path.exists("{0}/configs/init.config".format(dir)):
I mean, we should change the code to check os.path.exists(dir+"/init.raw")
not init.config
Also, we should add another assertion to check that at most one of the
two files init.config and edits.config exists.
Oh yes, I forgot, init.config is expected to exist after all, in the normal
case.
But I don't like the way you are changing all these things that are mixed
up in the normal code path-- it's not clear in your changed code what
relates to the transfer learning (or training an already-changed system)
and what does not. That needs to be clarified and it needs to be easy to
verify, from reading the changes on github, that for non-transfer learning,
it would behave the same as previously.
…On Mon, Jul 17, 2017 at 4:44 PM, pegahgh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In egs/wsj/s5/steps/libs/nnet3/train/common.py
<#1633 (comment)>:
> @@ -502,13 +502,34 @@ def smooth_presoftmax_prior_scale_vector(pdf_counts,
def prepare_initial_network(dir, run_opts, srand=-3):
- if os.path.exists(dir+"/configs/init.config"):
+ if not os.path.exists("{0}/configs/init.config".format(dir)):
We still check existence of init.config and use it for stage -4 (init.raw
generation) and stage -2 (lda generation)
Here are a lot of comments-- after reading through only part of the PR.
But the closer I look at it, the less happy I am with the architecture. My particular concern is the way in which the already-trained models are modified. The way the edits-config is used, and the interaction with the train.py scripts, makes it very confusing. And I think you are using "rename-node" in an unnecessary way where it would be better to just overwrite the old layer name.
I think a better approach than this, would be to have the example script in local/ create a model file directly, e.g. named $dir/input.raw, and pass it in as an optional argument to the training scripts; if this is supplied to the training scripts, then the configs/ directory would not be interrogated (and they wouldn't attempt to initialize the LDA-like matrix).
I'm pondering whether it might be a better idea, even for the existing training scripts, to better separate the model initialization and the training, so that the input would be a .raw file (with LDA layer already initialized), instead of having the training scripts do this. But we can do this as a separate thread.
# configs for transfer learning
common_egs_dir=
#srcdir=../../wsj/s5/
you should probably replace your own directory with the original $srcdir.
Also, for all scripts like this, I think you should check at the top of the script whether the required inputs, such as $src_mdl, are present.
srcdir=/export/a09/pegahgh/kaldi-transfer-learning/egs/wsj/s5-sp
src_mdl=$srcdir/exp/chain/tdnn1d_sp/final.mdl
src_lang=$srcdir/data/lang
src_gmm_mdl=$srcdir/exp/tri4b
why call this _mdl if it's a directory? It should be src_gmm_dir.
# while training new added layer using rm data.
# The chain config is as run_tdnn_5n.sh and the result is:
# System   tdnn_5n  tdnn_wsj_rm_1a  tdnn_wsj_rm_1b  tdnn_wsj_rm_1c
# WER      2.71     2.09            3.45            3.38
You should explain more about why you added the 1b and 1c scripts. I can see that the lexicon in this case may have words without pronunciations, since you use WSJ's lexicon. I guess you added these to demonstrate different ways to do it. But you're not clearly explaining what the difference is.
You say it uses " a src-tree-dir to generate new target alignment and lattices using source model".
But it seems to me a key difference that you didn't mention is that it uses the lexicon from WSJ. I'd like a clear explanation of what 1c is as well.
utils/create_split_dir.pl \
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
this 'zero' in the echo statement is not right. also I don't see that an echo statement is helpful here, you can use a comment.
    /export/b0{3,4,5,6}/$USER/kaldi-data/egs/rm-$(date +'%m_%d_%H_%M')/s5c/$dir/egs/storage $dir/egs/storage
fi
echo "$0: set the learning-rate-factor for initial network to be zero."
nnet3-copy --edits="set-learning-rate-factor name=* learning-rate-factor=$primary_lr_factor" \
I don't see what the point of the primary_lr_factor is here-- if you are setting a learning-rate factor for all layers, IMO it would make more sense to just multiply the initial and final learning rates by that factor..
@@ -20,32 +21,64 @@
logger.addHandler(logging.NullHandler())

def create_phone_lm(dir, tree_dir, run_opts, lm_opts=None):
    """Create a phone LM for chain training
def create_phone_lm(dir, tree_dir, run_opts, alignment_dirs=None, lm_opts=None):
Please revert your changes to this function and create another version of this function that takes its input from multiple alignment dirs. This will mean that we don't need to so carefully test whether this will break the normal code path; it also keeps the individual functions simple.
Also revert your changes to chain-est-phone-lm; what you need to do can be implemented much more simply.
You can just run a command for each source alignment dir that uses ali-to-phones to dump the phone sequences into a single gzipped archive for that source. And then you can combine the gzipped archives while repeating some of them the requisite number of times, e.g. "gunzip -c $dir/src1_phones.gz $dir/src2_phones.gz $dir/src2_phones.gz|", assuming the second source was repeated twice.
Vimal pointed out that this might fail due to argument list too long. But you can work around that by doing:
for x in [arg_list]; do gunzip -c $x; done | ...
I don't like the C++ code changes because having a scale that is required to be an integer is kind of weird, and because IMO it's just more elegant to repeat the input that many times, since that is what the code does. And also because of the principle that the "unusual case" shouldn't cause code to be modified in things used in the "common case". I want to keep the common case simple, which makes the codebase easier to maintain.
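A self-contained sketch of the combination step Dan describes (demo data and paths are made up; in the recipe each gzipped archive would be produced by ali-to-phones from one alignment dir). Each source archive is listed once per unit of its integer weight, and the loop form avoids an over-long argument list:

```shell
set -e
tmp=$(mktemp -d)

# stand-ins for per-source phone-sequence archives
printf 'a\n' | gzip -c > "$tmp/src1_phones.gz"
printf 'b\n' | gzip -c > "$tmp/src2_phones.gz"

# source 2 has weight 2, so its archive appears twice in the list
arg_list="$tmp/src1_phones.gz $tmp/src2_phones.gz $tmp/src2_phones.gz"

# loop rather than expanding the whole list on one command line,
# so a very long list cannot hit "argument list too long"
combined=$(for x in $arg_list; do gunzip -c "$x"; done)
echo "$combined"
```

In the real script the loop's output would be piped on to chain-est-phone-lm instead of captured.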
                      "{0}/ref.raw".format(config_dir))
def add_nnet_context_info(config_dir, existing_model=None,
                          edits_config=None):
    """This will be removed when python script refactoring is done."""
it seems like you have resolved a git conflict wrongly, this comment is an old comment that we removed.
right_context_final = (args.chunk_right_context_final + model_right_context if
                       args.chunk_right_context_final >= 0 else -1)

# Initialize as "raw" nnet, prior to training the LDA-like preconditioning
this comment seems misplaced.
    "creating does not have an output node called 'output' "
    "(e.g. for multilingual setups). You can set this to "
-   "an edit-string like: "
+   "an edits-config can contain string like: "
This comment still doesn't really explain what the option does. Explaining what an option is useful for is not a substitute for saying what it does.
Also you should say 'edits-config filename', not 'config filename', because they are different things.
But I think you should just remove this option completely, because I think this option is based on a misunderstanding. I think you didn't realize that it's possible to overwrite nodes and components by creating a new node with the old one's name. Just, if an older layer was of a different type (e.g. LSTM vs. TDNN), there may be confusingly named 'orphan' components and component-nodes left around, so it may make sense to do 'remove-orphans' in this situation. You could have this code do remove-orphans automatically in the case when you provide an initial nnet to this script.
                           model)
out = common_lib.get_command_stdout('nnet3-info "{0}" | head -n 4 '
                                    .format(model))
if edits_config is not None:
This whole code block is unreadable and very unclear. And it doesn't look correct to me either (the - " at the beginning).
i.e. layer in *.mdl -> corresponding 'XconfigExistingLayer' layer
'input-node name=ivector dim=100' ->
    'existing name=ivector dim=100'
'component-node name=tdnn1.affine ** input-node=1000 '
input-node -> input-dim and output-node -> output-dim
Update xconfig_to_configs.py
Update make_weighted_den_fst.sh
class XconfigExistingLayer(XconfigLayerBase):
    """
    This class is for lines like
    'existing name=tdnn1.affine dim=40'.
This documentation is not meaningful. Where I wrote:
This class is for lines like
xxxxxx
the line xxxxx would appear in an xconfig file. But these wouldn't appear in an xconfig file.
I don't want to have to give you the documentation word for word. Figure it out.
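For illustration only (a hedged sketch, not the actual generated output; node names and dims are hypothetical): the 'existing' lines in question describe nodes of an already-trained *.mdl, so that later xconfig lines can refer to them, rather than being written by a user:

```
existing name=ivector dim=100
existing name=tdnn1.affine dim=625
```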
echo "e.g.: $0 exp/tri1_ali exp/tri2_ali exp/chain/tdnn_1a_sp"; | ||
echo "Options: " | ||
echo " --cmd (run.pl|queue.pl...) # specify how to run the sub-processes."; | ||
echo "--lm-opts # options for phone LM generation"; |
please document the 'weights' option here. It's important, so it should go in the usage message as well as in the comment.
Some more mostly cosmetic comments.
if [ -z $weights ]; then
  # If 'weights' is not specified, comma-separated array '1' with dim
  # 'num_alignments' is defined as 'weights'.
  for n in `seq 1 $num_alignments`;do weights="$weights,1"; done
this doesn't look right, there would be a leading comma. And it's not ignored by cut (the first field is then the empty string before the leading comma):

$ echo ,a,b,c | cut -d, -f1

$ echo ,a,b,c | cut -d, -f2
a

That means this code path was not tested!
@pegahgh Since you are already creating w_arr below, you can use w_arr=(); then, num_alignments times, do w_arr+=(1.0). Then in line 90 you can use w=${w_arr[$n]}.
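A minimal sketch of that suggestion in bash (num_alignments is a dummy value here; in the script it comes from the number of alignment directories):

```shell
#!/usr/bin/env bash
# Build the default weights as a bash array instead of a comma-separated
# string, so there is no leading-comma problem to trip up 'cut'.
num_alignments=3   # dummy value for illustration

w_arr=()
for n in $(seq 1 "$num_alignments"); do
  w_arr+=(1.0)
done

# Per-alignment lookup is then a plain (0-based) array index:
n=0
w=${w_arr[$n]}
echo "$w"   # prints 1.0
```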
rm $adir/alignment_files.txt 2>/dev/null || true
for x in `seq $w`;do
  for j in `seq $num_jobs`;do
    echo $adir/ali.$j.gz >> $adir/alignment_files.txt
You are making this more complicated than it needs to be by writing to separate files. Write them all to the same file, $dir/alignment_files.txt. Then you don't need a loop in the $cmd script.
I need the loop, since we need a different final.mdl for each "ali-to-phones $adir/final.mdl".
I think the complexity of this command is underestimated, and this is being made more difficult than it previously was.
@pegahgh it is better to write the alignments to $dir/alignment.$n.txt instead of inside $adir, since you may not have write access there. Also, there would be race conditions if multiple experiments are run using the same alignment directories.
I can also write the model into the file on the same line as the alignment. Then we can have a single file.
There will still be one loop inside the command. There needs to be more escaping to split each line into model and alignment, which might again make it more complicated.
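To make the trade-off concrete, here is a sketch of the single-file variant being debated: each line carries both the model and the alignment archive, and the consumer splits the line back apart with 'read' (all paths below are dummies):

```shell
#!/usr/bin/env bash
# Writer side: one "<model> <alignment>" pair per line, all in one list file.
listfile=$(mktemp)
for adir in exp/tri1_ali exp/tri2_ali; do   # dummy alignment dirs
  echo "$adir/final.mdl $adir/ali.1.gz" >> "$listfile"
done

# Consumer side: 'read' splits each line on whitespace, so no extra
# escaping is needed to recover the model/alignment pair.
count=0
while read -r mdl ali; do
  count=$((count + 1))
  # here one would run, e.g.: ali-to-phones "$mdl" "ark:gunzip -c $ali|" ...
done < "$listfile"
rm -f "$listfile"
echo "$count"   # prints 2
```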
src/nnet3/nnet-utils.cc
Outdated
}
}
KALDI_LOG << "Set learning rate factors for " << num_learning_rate_factors_set
          << " nodes.";
change 'nodes' to 'components' since the learning rate factor applies to components, not nodes.
src/nnet3/nnet-utils.h
Outdated
@@ -283,6 +283,9 @@ void CollapseModel(const CollapseModelConfig &config,
    Note: this sets the 'underlying' learning rate, i.e. it will get
    multiplied by any 'learning-rate-factor' set in the nodes.
nodes -> components
src/nnet3/nnet-utils.h
Outdated
@@ -283,6 +283,9 @@ void CollapseModel(const CollapseModelConfig &config,
    Note: this sets the 'underlying' learning rate, i.e. it will get
    multiplied by any 'learning-rate-factor' set in the nodes.

    set-learning-rate-factor [name=<name-pattern>] learning-rate-factor=<learning-rate-factor>
      Sets the learning rate factor for any updatable nodes matching the name pattern.
nodes -> components
I agree! Thanks. Also, I should probably make the same change for set-learning-rate. Do you agree?
Oh, OK as-is then.
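As a usage sketch of the directive quoted above (the name pattern here is hypothetical), a line in an edits-config might read:

```
set-learning-rate-factor name=tdnn*.affine learning-rate-factor=0.25
```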
…On Tue, Sep 5, 2017 at 3:06 PM, Vimal Manohar ***@***.***> wrote:
yes.
…On Tue, Sep 5, 2017 at 3:15 PM, pegahgh ***@***.***> wrote:
Update make_weighted_den_fst.sh
* 'master' of https://github.com/kaldi-asr/kaldi: (43 commits)
  [src,scripts,egs] Transfer learning for ASR with nnet3 (kaldi-asr#1633)
  [src,scripts,egs] Attention modeling, with example scripts (kaldi-asr#1731)
  [src] Fix bug in block matrix addition (thanks: Sidhi Adkoli).
  [egs] Fix inconseqential input-checking bug in Swbd example script (kaldi-asr#1886)
  [build] dependency-check: that python2.7 and python3 exist and 2.7 is default (kaldi-asr#1876)
  [scripts] A cosmetic change to info messages in chain training (kaldi-asr#1880)
  [doc] Keep tutorial code up to date (thanks: Luwei Yang)
  [scripts] Bug-fix in long-utterance-segmentation script (thanks: Armin Oliya) (kaldi-asr#1877)
  [egs] Fixed some issues in the multilingual BABEL example scripts (kaldi-asr#1850)
  [build] Cosmetic fix in Makefile
  Remove memory leaks and unused variables (when CUDA is not enabled) (kaldi-asr#1866)
  [scripts] Fix default for egs.cmd in nnet3 training scripts (kaldi-asr#1865)
  [doc] Fix to how documentation is built (thanks: David van Leeuwen)
  [scripts] Add --decode-extra-opts in steps/decode.sh (required for speech activity detection scripts) (kaldi-asr#1859)
  [src] Adding documentation for lattice discriminative training functions (kaldi-asr#1854)
  [src] Typo fixes in documenation. (kaldi-asr#1857)
  [egs] Update to score.sh in fisher_swbd setup, allow --iter option (kaldi-asr#1853)
  [scripts] bug-fix in TFRNNLM rescoring script (no 'ark' needed for unk.probs file) (kaldi-asr#1851)
  [src] Remove repeated parameter documentation. (kaldi-asr#1849)
  [egs] Aspire example scripts: Update autoencoder example to xconfig (kaldi-asr#1847)
  ...
This is the transfer learning setup, which transfers a model trained on WSJ to RM.
The preliminary result, without any tuning, shows an improvement (2.71 -> 2.21).