nnet2 to nnet3 model converter #886

vijayaditya · 2016-07-07T17:26:27Z

There have been multiple requests for a tool to convert nnet2 models into nnet3 models.
There is no such tool currently available. However, it would be simple to write and does not even require touching the C++ parts of the nnet2 or nnet3 code.

The procedure is:

1. Convert the nnet2 model to text format using nnet-am-copy.
2. Isolate the parameter matrices corresponding to components (e.g. NaturalGradientAffineComponent) into separate files.
3. Use these files to initialize the components in the nnet3 model using the file-name argument in the config string. For this, write an nnet3 config which replicates the nnet2 model architecture, and use nnet3-am-init to initialize this model and write it out.
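As a rough illustration of step 2, here is a minimal sketch of pulling one bracketed matrix out of a text-format model. The token names `<LinearParams>` and `<BiasParams>`, the sample text, and the helper `extract_matrix` are assumptions made for this example; check them against the actual output of nnet-am-copy before relying on them.

```python
import re


def extract_matrix(text, token):
    """Return the first bracketed block after `token` as a list of float rows.

    `token` is assumed to be a marker such as "<LinearParams>" that is
    followed by a Kaldi-style text matrix: "[", rows of numbers, then "]".
    """
    match = re.search(re.escape(token) + r"\s*\[(.*?)\]", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("token not found: " + token)
    rows = []
    for line in match.group(1).splitlines():
        if line.strip():
            rows.append([float(x) for x in line.split()])
    return rows


# Tiny hand-written stand-in for one component of a text-format model
# (hypothetical; a real model has many components and much larger matrices).
sample = """<AffineComponent> <LearningRate> 0.01 <LinearParams>  [
  1 2 3
  4 5 6 ]
<BiasParams>  [ 0.5 -0.5 ]
</AffineComponent>"""

linear = extract_matrix(sample, "<LinearParams>")
bias = extract_matrix(sample, "<BiasParams>")[0]
print(linear, bias)
```

A real converter would loop over every component in the file and write each matrix to its own file; this sketch only finds the first match for a token.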

This would be a simple Python script involving some string parsing.

jfainberg · 2016-08-26T09:19:32Z

I'll start working on this (unless anyone else already is).

vijayaditya · 2016-08-26T16:17:16Z

Please go ahead, no one has taken this yet.
Btw, it would be helpful if you discussed your algorithm before implementation, as some changes can be a bit involved, e.g. SpliceComponent to Descriptor conversion.

Vijay
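To make the SpliceComponent-to-Descriptor point concrete, here is a minimal sketch that turns a list of frame offsets into an nnet3-style `Append(Offset(...))` descriptor string. The function name `splice_to_descriptor` and the node name `input` are assumptions for illustration; the sketch deliberately ignores complications such as a constant (non-spliced) part of the input, which a real converter may have to handle.

```python
def splice_to_descriptor(context, input_name="input"):
    """Build a descriptor string from splice offsets such as [-1, 0, 1]."""
    terms = []
    for offset in context:
        if offset == 0:
            # Offset 0 is just the input node itself.
            terms.append(input_name)
        else:
            terms.append("Offset({0}, {1})".format(input_name, offset))
    if len(terms) == 1:
        # A single term needs no Append() wrapper.
        return terms[0]
    return "Append({0})".format(", ".join(terms))


descriptor = splice_to_descriptor([-1, 0, 1])
print(descriptor)  # Append(Offset(input, -1), input, Offset(input, 1))
```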

jfainberg · 2016-09-06T14:32:43Z

OK, this is getting there, and I'll make a WIP PR soon once it's cleaned up a bit.

I have one question: when initialising affine components with nnet3-init, it expects a matrix with column dim = input_dim + 1 (bias). The isolated parameter matrices from the nnet2 model are split into LinearParameters and Bias. I could either use something like paste to manually append the bias to the linear parameters, or modify ::InitFromConfig to take either matrix or linear_params and bias separately. The first is a bit of a hack, but perhaps ok for the intended use of this?
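The first option (appending the bias to the linear parameters) could look roughly like the sketch below. It assumes the linear parameters are stored one output unit per row, so the bias becomes an extra last column, giving column dim = input_dim + 1 as described above. The helper names `append_bias` and `to_kaldi_text` are hypothetical, and the text layout is the usual Kaldi text-matrix form as I understand it.

```python
def append_bias(linear, bias):
    """Append bias[i] as the last column of row i of `linear`."""
    if len(linear) != len(bias):
        raise ValueError("bias length must equal the number of rows")
    return [row + [b] for row, b in zip(linear, bias)]


def to_kaldi_text(matrix):
    """Format a list of rows as a Kaldi-style text matrix."""
    rows = ["  " + " ".join(str(v) for v in row) for row in matrix]
    return " [\n" + "\n".join(rows) + " ]\n"


linear = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # 2 outputs, input_dim = 3
bias = [0.5, -0.5]
combined = append_bias(linear, bias)            # 2 rows, 3 + 1 columns
text = to_kaldi_text(combined)
print(text)
```

The resulting text could then be written to a file and passed to the component via the matrix option mentioned above.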

danpovey · 2016-09-06T18:09:44Z

Either way is OK with me.

yujunlhz · 2017-04-19T19:20:24Z

Dear Kaldi developers:

I'd like to know whether there is any update for this thread?

thanks

danpovey · 2017-04-19T19:23:04Z

I am not aware of any update, but it would be nice if Joachim comes through with this. Actually, this is not a priority for me at all because, as time goes on, fewer people will be using nnet2 models.

jfainberg · 2017-04-19T19:28:02Z

Sorry, other things took priority and I forgot to come back to this. I have done some work on it; I believe it was nearly ready. I'll have a look at it this coming weekend and make a PR.

KnowBetterHelps · 2018-06-13T01:40:30Z

I used the convert_nnet2_to_nnet3.py script in kaldi-master/egs/librispeech/s5 to transform my nnet2 model to nnet3. The model came from /trunk/egs/librispeech/s5/exp/nnet2_online/nnet_a/final.mdl in http://kaldi-asr.org/downloads/build/6/trunk/egs/librispeech/s5/exp/nnet2_online/nnet_a/. But I got the following error:
ERROR (nnet3-init[5.4]:Check():nnet-nnet.cc:737) Dimension mismatch for network-node fixed-affine1: input-dim 2100 versus component-input-dim 700

It seems that the feature dimensions do not match, but how can I fix this problem?
Any reply would be greatly appreciated!

danpovey · 2018-06-13T02:01:11Z

There is not a lot of reason to use that script-- we may have created it for testing purposes. If you have a nnet2 model it will generally be easier to just use nnet2 tools with it. Dan

KnowBetterHelps · 2018-06-13T02:27:30Z

Thanks for your answer!

But I do have a reason to use an nnet3 model (https://github.com/tbright17/kaldi-dnn-ali-gop); with the nnet2 model I get an error like this:
ERROR (compute-dnn-gop[5.4]:ExpectToken():io-funcs.cc:212) Expected token "", got instead ""
more information: compute-dnn-gop --use-gpu=no exp_dnn/gop/tree exp_dnn/gop/final.mdl data_dnn/lang/L.fst 'ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data_dnn/eval/split1/1/utt2spk scp:data_dnn/eval/split1/1/cmvn.scp scp:data_dnn/eval/split1/1/feats.scp ark:- |' scp:data_dnn/eval/ivectors_hires/ivector_online.scp 'ark:utils/sym2int.pl --map-oov 2 -f 2- data_dnn/lang/words.txt data_dnn/eval/split1/1/text|' ark,t:exp_dnn/gop/gop.1 ark,t:exp_dnn/gop/align.1 ark,t:exp_dnn/gop/phoneme_ll.1

danpovey · 2018-06-13T02:33:05Z

Maybe you can ask those people to make available the model they used the tool with.

KnowBetterHelps · 2018-06-13T03:00:56Z

ok, thanks!

ltbringer · 2019-03-01T10:26:16Z

I tried using the script convert_nnet2_to_nnet3.py like so:

python steps/nnet3/convert_nnet2_to_nnet3.py /home/app/models/english/tedlium_nnet_ms_sp_online /home/app/models/english_nnet3 

steps/nnet3/convert_nnet2_to_nnet3.py /home/app/models/english/tedlium_nnet_ms_sp_online /home/app/models/english_nnet3
2019-03-01 08:35:54,642 [convert_nnet2_to_nnet3.py:442 - Main - INFO ] Converting nnet2 model /home/app/models/english/tedlium_nnet_ms_sp_online/final.mdl to nnet3 model /home/app/models/english_nnet3/final.mdl
nnet-am-copy --remove-preconditioning=true --binary=false /home/app/models/english/tedlium_nnet_ms_sp_online/final.mdl ./tmpq_8B4X/final.mdl 
LOG (nnet-am-copy[5.5.211~1419-5f05d]:main():nnet-am-copy.cc:207) Copied neural net from /home/app/models/english/tedlium_nnet_ms_sp_online/final.mdl to ./tmpq_8B4X/final.mdl
2019-03-01 08:36:18,467 [convert_nnet2_to_nnet3.py:147 - write_config - INFO ] Writing config to ./tmpq_8B4X/config

2019-03-01 08:36:18,468 [convert_nnet2_to_nnet3.py:174 - write_config - WARNING ] Assuming linear objective.
nnet3-init --binary=true ./tmpq_8B4X/config ./tmpq_8B4X/nnet3.raw 
ERROR (nnet3-init[5.5.211~1419-5f05d]:Check():nnet-nnet.cc:737) Dimension mismatch for network-node fixed-affine1: input-dim 700 versus component-input-dim 300

Would appreciate any help to solve this
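The thread does not state what caused this mismatch. As a purely illustrative sanity check, the sketch below computes a spliced input dimension with and without a constant (non-spliced) part. Every number in it is an assumption chosen because it happens to reproduce the two values in the error message: a 140-dim input, 5 splice offsets, and a 100-dim constant part. The formula reflects my understanding of how a splice with a constant part is sized, not something confirmed in this thread.

```python
def spliced_dim(input_dim, num_offsets, const_dim=0):
    """Dimension after splicing when `const_dim` entries are not repeated.

    The non-constant part is copied once per offset; the constant part
    (for example, a per-utterance vector) is appended only once.
    """
    return (input_dim - const_dim) * num_offsets + const_dim


# Hypothetical values: they are NOT taken from the model in question.
naive = spliced_dim(140, 5)                      # splices everything
with_const = spliced_dim(140, 5, const_dim=100)  # keeps 100 dims unspliced
print(naive, with_const)
```

Comparing the two results against the "input-dim" and "component-input-dim" values in such an error can help narrow down whether the splicing configuration in the generated config matches the original model.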

jfainberg · 2019-03-01T10:46:10Z

@AmreshVenugopal, would you be able to share the nnet2 final.mdl with me? I'd be happy to debug this on Monday (but have limited time until then).

ltbringer · 2019-03-01T11:01:28Z

Thanks for replying @jfainberg. Sure thing! I'll try to put it up somewhere and share the link as soon as it is up.

ltbringer · 2019-03-02T07:39:45Z

@jfainberg I have added the nnet2 final.mdl and other things that may or may not be of use here.

Kirito0816 · 2019-04-24T18:45:11Z

Hello, I ran into a problem with the converter too:
steps/nnet3/convert_nnet2_to_nnet3.py exp/tri4_nnet exp/tri5_nnet
2019-04-25 02:13:51,478 [convert_nnet2_to_nnet3.py:441 - Main - INFO ] Converting nnet2 model exp/tri4_nnet/final.mdl to nnet3 model exp/tri5_nnet/final.mdl
/bin/sh: 1: nnet-am-copy: not found
Traceback (most recent call last):
File "steps/nnet3/convert_nnet2_to_nnet3.py", line 468, in
Main()
File "steps/nnet3/convert_nnet2_to_nnet3.py", line 448, in Main
.format(args.nnet2_dir, args.model, tmpdir))
File "steps/libs/common.py", line 157, in execute_command
p.returncode, command))
Exception: Command exited with status 127: nnet-am-copy --remove-preconditioning=true --binary=false exp/tri4_nnet/final.mdl ./tmp_V4bPJ/final.mdl
I found the nnet-am-copy file under the directory kaldi/src/nnet2bin, but I don't know why it is not found. Thank you.

danpovey · 2019-04-24T18:46:16Z

It's a path issue, look for a tutorial on UNIX paths and you'll understand.
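Exit status 127 from the shell means the command could not be found on the search path. A minimal sketch for checking this from Python before running the converter follows; the directory `/path/to/kaldi/src/nnet2bin` is a placeholder, and `find_tool` is a hypothetical helper. In Kaldi recipe directories the search path is normally set up by sourcing `path.sh` in the shell first.

```python
import os
import shutil


def find_tool(name, extra_dirs=()):
    """Look for an executable in `extra_dirs` first, then in $PATH.

    Returns the full path if found, otherwise None.
    """
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


# Placeholder directory: replace with the real location of the binaries.
location = find_tool("nnet-am-copy", extra_dirs=["/path/to/kaldi/src/nnet2bin"])
print(location if location else "nnet-am-copy is not on the search path")
```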

Kirito0816 · 2019-04-24T19:08:14Z

@danpovey Thank you. It seems that I did not add the Kaldi path to the PATH variable, right? How can I add it? Something like export KALDI=<kaldi's path>?

danpovey · 2019-04-24T19:12:38Z

If I wanted to tell you I would have told you. Look it up.

This issue was referenced by a merged commit whose log includes:

* [scripts] bug fix to nnet2->3 conversion, fixes kaldi-asr#886 (kaldi-asr#3071)

Joseph-sun · 2019-11-12T03:38:05Z

Hello, I ran into a problem with the converter:
"
steps/nnet3/convert_nnet2_to_nnet3.py exp/tri6b_nnet2 exp/tri6b_nnet3
2019-11-11 19:24:02,059 [convert_nnet2_to_nnet3.py:459 - Main - INFO ] Converting nnet2 model exp/tri6b_nnet2/final.mdl to nnet3 model exp/tri6b_nnet3/final.mdl
nnet-am-copy --remove-preconditioning=true --binary=false exp/tri6b_nnet2/final.mdl ./tmp5h7u8k_e/final.mdl
LOG (nnet-am-copy[5.5.4801-09abd]:main():nnet-am-copy.cc:207) Copied neural net from exp/tri6b_nnet2/final.mdl to ./tmp5h7u8k_e/final.mdl
2019-11-11 19:24:09,817 [convert_nnet2_to_nnet3.py:151 - write_config - INFO ] Writing config to ./tmp5h7u8k_e/config
2019-11-11 19:24:09,817 [convert_nnet2_to_nnet3.py:183 - write_config - WARNING ] Assuming linear objective.
nnet3-init --binary=true ./tmp5h7u8k_e/config ./tmp5h7u8k_e/nnet3.raw
ERROR (nnet3-init[5.5.4801-09abd]:InitFromConfig():nnet-simple-component.cc:1225) Could not process these elements in initializer: learning-rate=1e

[ Stack-Trace: ]
nnet3-init(kaldi::MessageLogger::LogMessage() const+0x82c) [0x6a750e]
nnet3-init(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x49b483]
nnet3-init(kaldi::nnet3::AffineComponent::InitFromConfig(kaldi::ConfigLine*)+0x55e) [0x4f4cce]
nnet3-init(kaldi::nnet3::Nnet::ProcessComponentConfigLine(int, kaldi::ConfigLine*)+0x3c7) [0x4b0f07]
nnet3-init(kaldi::nnet3::Nnet::ReadConfig(std::istream&)+0x292) [0x4b6898]
nnet3-init(main+0x3fd) [0x494d23]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f792f55e830]
nnet3-init(_start+0x29) [0x494859]

kaldi::KaldiFatalError
Traceback (most recent call last):
  File "steps/nnet3/convert_nnet2_to_nnet3.py", line 486, in <module>
    Main()
  File "steps/nnet3/convert_nnet2_to_nnet3.py", line 475, in Main
    binary=args.binary)
  File "steps/nnet3/convert_nnet2_to_nnet3.py", line 193, in write_model
    .format(self.config, os.path.join(tmpdir, "nnet3.raw")))
  File "steps/libs/common.py", line 158, in execute_command
    p.returncode, command))
Exception: Command exited with status 255: nnet3-init --binary=true ./tmp5h7u8k_e/config ./tmp5h7u8k_e/nnet3.raw
"

The nnet2/final.mdl was converted from exp/tri6b_dnn/final.nnet.
How can I solve this problem?
Thank you.

danpovey · 2019-11-12T03:40:10Z

Looks like it might be a tokenization issue. I'm afraid I didn't write that code (at least, don't recall writing it) and don't really maintain it myself. You could perhaps check in the git log who wrote it and ask them.


…3713) Parsing learning rates would fail if they were written in e-notation (e.g. 1e-5). Also fixes ivector-dim=0, which could result if the const component dim=0 in the source nnet2 model.
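
To illustrate the kind of parsing failure described here, the following is a minimal sketch, not the actual code from convert_nnet2_to_nnet3.py. The two regular expressions, the helper `parse_learning_rate`, and the sample line are all hypothetical and chosen only to reproduce the truncated `learning-rate=1e` seen in the error above. The assumption is that a pattern which does not allow a sign inside the number stops at the `-` of `1e-05`.

```python
import re

# Hypothetical patterns for illustration only; they are NOT copied from
# steps/nnet3/convert_nnet2_to_nnet3.py.
# NAIVE accepts word characters and dots, so it stops at the '-' in "1e-05".
NAIVE = re.compile(r"<LearningRate>\s+([\w.]+)")
# ROBUST accepts an optional sign, a decimal mantissa, and an optional exponent.
ROBUST = re.compile(
    r"<LearningRate>\s+([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
)


def parse_learning_rate(text, pattern=ROBUST):
    """Return the learning-rate token found in `text` as a string."""
    match = pattern.search(text)
    if match is None:
        raise ValueError("no <LearningRate> token found")
    return match.group(1)


# A made-up line in the style of a text-format model, for demonstration.
line = "<AffineComponent> <LearningRate> 1e-05 <LinearParams>"

print(parse_learning_rate(line, NAIVE))   # truncated token
print(parse_learning_rate(line, ROBUST))  # full token
```

With the hypothetical naive pattern the token is cut off before the exponent sign, which would produce an unparseable `learning-rate=1e` in the generated config. The second pattern keeps the whole number, so `float()` can convert it.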

@spencerkirn

* [egs] New chime-5 recipe (kaldi-asr#2893) * [scripts,egs] Made changes to the augmentation script to make it work for ASR and speaker ID (kaldi-asr#3119) Now multi-style training with noise and reverberation is an option (instead of speed augmentation). Multi-style training seems to be more robust to unseen/noisy conditions. * [egs] updated local/musan.sh to steps/data/make_musan.sh in speaker id scripts (kaldi-asr#3320) * [src] Fix sample rounding errors in extract-segments (kaldi-asr#3321) With a segments file constructed from exact wave file durations some segments came out one sample short. The reason is the multiplication of the float sample frequency and double audio time point is inexact. For example, float 8000.0 multiplied by double 2.03 yields 16239.99999999999, one LSB short of the correct sample number 16240. Also changed all endpoint calculations so that they performed in seconds, not sample numbers, as this does not require a conversion in nearly every comparison, and report positions in diagnostic messages also in seconds, not sample numbers. * [src,scripts]Store frame_shift, utt2{dur,num_frames}, .conf with features (kaldi-asr#3316) Generate utt2dur and utt2num_frames during feature extraction, and store frame period in frame_shift file in feature directory. Copy relevant .conf files used in feature extraction into the conf/ subdirectory with features. Add missing validations and options in some extraction scripts. * [build] Initial version of Docker images for (CPU and GPU versions) (kaldi-asr#3322) * [scripts] fix typo/bug in make_musan.py (kaldi-asr#3327) * [scripts] Fixed misnamed variable in data/make_musan.py (kaldi-asr#3324) * [scripts] Trust frame_shift and utt2num_frames if found (kaldi-asr#3313) Getting utt2dur involves accessing wave files, and potentially running full pipelines in wav.scp, which may take hours for a large data set. If utt2num_frames exists, use it instead if frame rate is known. 
Issue: kaldi-asr#3303 Fixes: kaldi-asr#3297 "cat: broken pipe" * [scripts] typo fix in augmentation script (kaldi-asr#3329) Fixes typo in kaldi-asr#3119 * [scripts] handle frame_shit and utt2num_frames in utils/ (kaldi-asr#3323) subset_data_dir.sh has been refactored thoroughly so that its logic can be followed easier. It has been well tested and dogfooded. All changes here are necessary to subset, combine and verify utt2num_frames, and copy frame_shift to new directories where necessary. * [scripts] Extend combine_ali_dirs.sh to combine alignment lattices (kaldi-asr#3315) Relevant discussion: https://groups.google.com/forum/#!topic/kaldi-help/2uxfByEAmfw * [src] Fix rare case when segment end rounding overshoots file end in extract-segments (kaldi-asr#3331) * [scripts] Change --modify-spk-id default to False; back-compatibility fix for kaldi-asr#3119 (kaldi-asr#3334) * [build] Add easier configure option in failure message of configure (kaldi-asr#3335) * [scripts,minor] Fix typo in comment (kaldi-asr#3338) * [src,egs] Add option for applying SVD on trained models (kaldi-asr#3272) * [src] Add interfaces to nnet-batch-compute that expects device input. (kaldi-asr#3311) This avoids a ping pong of memory to host. Implementation now assumes device memory. interfaces will allocate device memory and copy to it if data starts on host. Add a cuda matrix copy function which clamps rows. This is much faster than copying one row at a time and the kernel can handle the clamping for free. 
* [build] Update GCC support check for CUDA toolkit 10.1 (kaldi-asr#3345) * [egs] Fix to aishell1 v1 download script (kaldi-asr#3344) * [scripts] Support utf-8 files in some scripts (kaldi-asr#3346) * [src] Fix potential underflow bug in MFCC, RE energy floor, thx: Zoltan Tobler (kaldi-asr#3347) * [scripts]: add warning to nnet3/chain/train.py about ineffective options (kaldi-asr#3341) * [scripts] Fix regarding UTF handling in cleanup script (kaldi-asr#3352) * [scripts] Change encoding to utf-8 in data augmentation scripts (kaldi-asr#3360) * [src] Add CUDA accelerated MFCC computation. (kaldi-asr#3348) * Add CUDA accelerated MFCC computation. Creates a new directory 'cudafeat' for placing cuda feature extraction components as it is developed. Added a directory 'cudafeatbin' for placing binaries that are cuda accelerated that mirrior binaries elsewhere. This commit implements: feature-window-cuda.h/cu which implements a feature window on the device by copying it from a host feature window. feature-mfcc-cuda.h/cu which implements the cuda mfcc feature extractor. compute-mfcc-feats-cuda.cc which mirriors compute-mfcc-feats.cc There were also minor changes to other files. * Only build cuda binaries if cuda is enabled * [src] Optimizations for batch nnet3. The issue fixed here is that (kaldi-asr#3351) small cuda memory copies are inefficeint because each copy can add multiple micro-seconds of latency. The code as written would copy a small matrices or vectors to and from the tasks one after another. To avoid this i've implemented a batched matrix copy routine. This takes arrays of matrix descriptions for the input and output and batches the copies in a single kernel call. This is used in both FormatInputs and FormatOutputs to reduce launch latency overhead. The kernel for the batched copy uses a trick to avoid a memory copy of the host paramters. The parameters are put into a struct containing a static sized array. 
These parameters are then marshalled like normal cuda parameters. This avoids additional launch latency overhead. There is still more work to do at the beginning and end of nnet3. In particular we may want to batch the clamped memory copies and the large number of D2D copies at the end. I haven't fully tracked those down and may return to them in the future. * [scripts,minor] Remove outdated comment (kaldi-asr#3361) * [egs] A kaldi recipe based on the corpus named "aidatatang_200zh". (kaldi-asr#3326) * [src] nnet1: changing end-rule in 'nnet-train-multistream', (kaldi-asr#3358) - end the training when there is no more data to refill one of the streams, - this avoids overtraining to the 'last' utterance, * [scripts] Fix how the empty (faulty?) segments are handled in data-cleanup code (kaldi-asr#3337) * [src] Fix to bug in ivector extraction causing assert failure, thx: sray (kaldi-asr#3364) * [src] Fix to bug in ivector extraction causing assert failure, thx: sray (kaldi-asr#3365) * [scripts] add script to compute dev PPL on kaldi-rnnlm (kaldi-asr#3340) * [scripts,egs] Small fixes to diarization scripts (kaldi-asr#3366) * [egs] Modify split_scp.pl usage to match its updated code (kaldi-asr#3371) * [src] Fix non-cuda `make depend` build by putting compile guards around header. (kaldi-asr#3374) * [build] Docker docs update and minor changes to the Docker files (kaldi-asr#3377) * [egs] Scripts for MATERIAL ASR (kaldi-asr#2165) * [src] Batch nnet3 optimizations. Batch some of the copies in and copies out (kaldi-asr#3378) * [build] Widen cuda guard in cudafeat makefile. (kaldi-asr#3379) * [scripts] nnet1: updating the scripts to support 'online-cmvn', (kaldi-asr#3383) * [build,src] Enhancements to the cudamatrix/cudavector classes. (kaldi-asr#3373) * Added CuSolver to the matrix class. This is only supported with Cuda 9.1 or newer. Calling CuSolver code without Cuda 9.1 or newer will result in a runtime error. 
This change required some changes to the build system which requires versioning the configure script. This forces everyone to reconfigure. Failure to reconfigure would result in linking and build errors on some systems. * [egs] Fix perl `use encoding` deprecation (kaldi-asr#3386) * [scripts] Add max_active to align_fmllr_lats.sh to prevent rare crashes (kaldi-asr#3387) * [src] Implemented CUDA acclerated online cmvn. (kaldi-asr#3370) This patch is part of a larger effort to implement the entire online feature pipeline in CUDA so that wav data is transfered to the device and never copied back to the host. This patch includes a new binary cudafeatbin/apply-cmvn-online.cc which for the most part matches online2bin/apply-cmvn-online. This binary is primarily for correctness testing and debugging as it makes no effort to compute multiple features in parallel on the device. The CUDA performance is dominiated by the cost of copying the feature to and from the device. While there is a small speedup I do not expect this binary to be used in production. Instead users will use the upcomming online-pipeline which will take features directly from the mfcc computation on the device and pass results to the next part of the pipeline. Summary of changes: Makefile: Added online2 dependencies to cudafeat, cudafeatbin, cudadecoder, and cudadecoderbin. cudafeat\: Makefile: added online2 dependency, added new .cu/.h files feature-online-cmvn-cuda.cu/h: implements online-cmvn in cuda. cudafeatbin\: Makefile: added new binary, added online2 dependency apply-cmvn-online-cuda.cc: binary which mimics online2bin/apply-cmvn-online Correctness testing: The correctness was tested by generating set of 20000 features and then running the CPU binary and GPU binary and comparing results using featbin/compare-feats. 
../online2bin/apply-cmvn-online /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn.ark,cmvn.scp" ./apply-cmvn-online-cuda /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn-cuda.ark,cmvn-cuda.scp" ../featbin/compare-feats ark:cmvn-cuda.ark ark:cmvn.ark LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:105) self-product of 1st features for each column dimension: [ 5.52221e+09 9.1134e+09 5.92818e+09 7.42173e+09 7.48633e+09 7.21316e+09 6.9515e+09 7.03883e+09 6.40267e+09 5.83088e+09 5.01438e+09 5.1575e+09 4.28688e+09 3.529e+09 3.12182e+09 2.28721e+09 1.76343e+09 1.35117e+09 8.72517e+08 5.31836e+08 2.65112e+08 9.20308e+07 1.24084e+07 3.56008e+06 4.25283e+07 1.09786e+08 1.88937e+08 2.60207e+08 3.23115e+08 3.56371e+08 3.69035e+08 3.65216e+08 3.89125e+08 4.07064e+08 3.40407e+08 2.65444e+08 2.50244e+08 2.05726e+08 1.60606e+08 1.07217e+08 ] LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:106) self-product of 2nd features for each column dimension: [ 5.5223e+09 9.11355e+09 5.92812e+09 7.4218e+09 7.48666e+09 7.21338e+09 6.95174e+09 7.03895e+09 6.40254e+09 5.83113e+09 5.01411e+09 5.15774e+09 4.28692e+09 3.52918e+09 3.122e+09 2.28693e+09 1.76326e+09 1.3513e+09 8.72521e+08 5.31802e+08 2.65137e+08 9.20296e+07 1.2408e+07 3.5604e+06 4.25301e+07 1.09793e+08 1.88933e+08 2.60217e+08 3.23124e+08 3.56371e+08 3.69007e+08 3.65176e+08 3.89104e+08 4.07067e+08 3.40416e+08 2.65498e+08 2.50196e+08 2.057e+08 1.60612e+08 1.07192e+08 ] LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:107) cross-product for each column dimension: [ 5.52209e+09 9.11229e+09 5.92538e+09 7.41665e+09 7.47877e+09 7.20269e+09 6.93785e+09 7.02284e+09 6.38411e+09 5.81143e+09 4.99389e+09 5.13753e+09 4.26792e+09 3.51154e+09 3.10676e+09 2.27436e+09 1.75322e+09 1.34367e+09 8.67367e+08 5.28672e+08 2.63516e+08 9.14194e+07 1.23215e+07 3.53409e+06 4.21905e+07 1.08872e+08 1.87238e+08 2.57779e+08 3.19827e+08 
3.5252e+08 3.64691e+08 3.60529e+08 3.84482e+08 4.02396e+08 3.36136e+08 2.61631e+08 2.46931e+08 2.03079e+08 1.5856e+08 1.05738e+08 ] LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:111) Similarity metric for each dimension [ 0.99997 0.999871 0.999532 0.999311 0.998968 0.998533 0.998019 0.997719 0.997111 0.996644 0.995941 0.996104 0.995572 0.995028 0.995147 0.994445 0.994258 0.994402 0.994095 0.994084 0.993934 0.993363 0.993015 0.992655 0.992037 0.991645 0.991017 0.990649 0.98981 0.989195 0.988267 0.987222 0.988093 0.98853 0.987442 0.985534 0.986858 0.987196 0.987242 0.986318 ] (1.0 means identical, the smaller the more different) LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:116) Overall similarity for the two feats is:0.993119 (1.0 means identical, the smaller the more different) LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:119) Processed 20960 feature files, 0 had errors. LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:126) Features are considered similar since 0.993119 >= 0.99 * [egs] Fixed file path RE augmentation, in aspire recipe (kaldi-asr#3388) * [scripts] Update taint_ctm_edits.py, RE utf-8 encoding (kaldi-asr#3392) * [src] Change nnet3-am-copy to allow more manipulations (kaldi-asr#3393) * [egs] Remove confusing setting of overridden num_epochs variable in aspire (kaldi-asr#3394) * [build] Add a missing dependency for "decoder" in Makefile (kaldi-asr#3397) * [src] CUDA decoder performance patch (kaldi-asr#3391) * [build,scripts] Dependency fix; add cross-references to scripts (kaldi-asr#3400) * [egs] Fix cleanup-after-partial-download bug in aishell (kaldi-asr#3404) * [src] Change functions like AppiyLog() to all work out-of-place (kaldi-asr#3185) * [src] Make stack trace display more user friendly (kaldi-asr#3406) * [egs] Fix to separators in Aspire reverb recipe (kaldi-asr#3408) * [egs] Fix to separators in Aspire, related to kaldi-asr#3408 (kaldi-asr#3409) * [src] online2-tcp, add option to 
display start/end times (kaldi-asr#3399) * [src] Remove debugging assert in cuda feature extraction code (kaldi-asr#3411) * [scripts] Fix to checks in adjust_unk_graph.sh (kaldi-asr#3410) bash test `-f` does not work for `phones/` which is a directory. Changed it to `-e`. * [src] Added GPU feature extraction (will improve speed of GPU decoding) (kaldi-asr#3390) Currently only supports MFCC features. * [src] Fix build error introducted by race condition in PR requests/accepts. (kaldi-asr#3412) * [src] Added error string to CUDA allocation errors. (kaldi-asr#3413) * [src] Fix CUDA_VSERION number in preprocessor checks (kaldi-asr#3414) * [src] Fix build of online feature extraction with older CUDA version (kaldi-asr#3415) * [src] Update Insert function of hashlist and decoders (kaldi-asr#3402) makes interface of HashList more standard; slight speed improvement. * [src] Fix spelling mistake in kaldi-asr#3415 (kaldi-asr#3416) * [build] Fix configure bug RE CuSolver (kaldi-asr#3417) * [src] Enable an option to use the GPU for feature extraction in GPU decoding (kaldi-asr#3420) This is turned on by using the option --gpu-feature-extract=true. By default this is on. We provie the option to turn it off because in situations where CPU resources are unconfined you can get slightly higher performance with CPU feature extraction but in most cases GPU feature extraction is faster and has more stable performance. In addition a user may wish to turn it off to support models where feature extraction is currently incomplete (e.g. FBANK, PLP, PITCH, etc). We will add those features in the future but for now a user wanted to decode those models should place feature extraction on the host. 
* [egs] Replace $cuda_cmd with $train_cmd for FarsDat (kaldi-asr#3426) * [src] Remove outdated comment (kaldi-asr#3148) (kaldi-asr#3422) * [src] Adding missing thread.join in CUDA decoder and fixing two todos (kaldi-asr#3428) * [build] Add missing lib dependency in cudafeatbin (kaldi-asr#3427) * [egs] Small fix to aspire run_tdnn_7b.sh (kaldi-asr#3429) * [build] Fix to cuda makefiles, thanks: yiyidhuang@gmail.com (kaldi-asr#3431) * [build] Add missing deps to cuda makefiles, thanks: yiyidhuang@gmail.com (kaldi-asr#3432) * [egs] Fix encoding issues in Chinese ASR recipe (kaldi-asr#3430) (kaldi-asr#3434) * Revert "[src] Update Insert function of hashlist and decoders (kaldi-asr#3402)" (kaldi-asr#3436) This reverts commit 5cc7ce0. * [src] Update Insert function of hashlist and decoders (kaldi-asr#3402) (kaldi-asr#3438) makes interface of HashList more standard; slight speed improvement. Fixed version of kaldi-asr#3402 * [build] Fix the cross-compiling issue for Android under MacOS (kaldi-asr#3435) * [src] Marking operator as __host__ __device__ to avoid build issues (kaldi-asr#3441) avoids cudafeat build failures with some CUDA toolkit versions * [egs] Fix perl encoding bug (was causing crashes) (kaldi-asr#3442) * [src] Cuda decoder fixes, efficiency improvements (kaldi-asr#3443) * [scripts] Fix shebang of taint_ctm_edits.py to invoke python3 directly (kaldi-asr#3445) * [src] Fix to a check in nnet-compute code (kaldi-asr#3447) * [src,scripts] Various typo fixes and stylistic fixes (kaldi-asr#3153) * [scripts] Scripts for VB (variational bayes) resegmentation for Callhome diarization (kaldi-asr#3305) This refines the segment boundaries. Based on code originally by Lukas Burget from Brno. 
* [scripts] Extend utils/data/subsegment_data_dir.sh to copy reco2dur (kaldi-asr#3452) * [src,scripts,egs] Add code and example for SpecAugment in nnet3 (kaldi-asr#3449) * [scripts] Make segment_long_utterance honor frame_shift (kaldi-asr#3455) * [scripts] Fix to steps/nnet/train.sh (nnet1) w.r.t. incorrect bash test expressions (kaldi-asr#3456) * [egs] fixed bug in egs/gale_arabic/s5c/local/prepare_dict_subword.sh that it may delete words matching '<*>' (kaldi-asr#3465) * [src,build] Small fixes (kaldi-asr#3472) * [src] fixed warning: moving a temporary object prevents copy elision * [scripts] tools/extras/check_dependencies.sh look for alternative MKL locations via MKL_ROOT environment variable * [src] Fixed compilation error when DEBUG is defined * [egs] Add MGB-2 Arabic recipe (kaldi-asr#3333) * [scripts] Check/fix utt2num_frames when fixing data dir. (kaldi-asr#3482) * [src] A couple small bug fixes. (kaldi-asr#3477) * [src] A couple small bug fixes. * [src] Fix RE pdf-class 0, which is valid in pre-kaldi10 * [src,scripts] Cosmetic,file-mode fixes; fix to nnet1 align.sh introduced in kaldi-asr#3383 (kaldi-asr#3487) * [src] Small cosmetic and file-mode fixes * [src] Fix bug in nnet1 align.sh introduced in kaldi-asr#3383 * [egs] Add missing script in MGB2 recipe (kaldi-asr#3491) * [egs] Fixing nnet1 but introduced in kaldi-asr#3383 (rel. to kaldi-asr#3487) (kaldi-asr#3494) * [src] Fix for nnet3 bug encountered when implementing deltas. (kaldi-asr#3495) * [scripts,egs] Replace LDA layer with delta and delta-delta features (kaldi-asr#3490) WER is about the same but this is simpler to implement and more standard. 
* [egs] Add updated tdnn recipe for AMI (kaldi-asr#3497) * [egs] Create MALACH recipe based on s5b for AMI (kaldi-asr#3496) * [scripts] add --phone-symbol-table to prepare_lang_subword.sh (kaldi-asr#3485) * [scripts] Option to prevent crash when adapting on much smaller data (kaldi-asr#3506) * [build,scripts] Make OpenBLAS install check for gfortran; documentation fix (kaldi-asr#3507) * [egs] Update chain TDNN-F recipe for CHIME s5 to match s5b, improves results (kaldi-asr#3505) * [egs] Fix to kaldi-asr#3505: updating chime5 TDNN-F script (kaldi-asr#3508) * [scripts] Fixed issue that leads to empty segment file (kaldi-asr#3510) Fixed issue that leads to empty segment file on SSD disk (more detail: https://groups.google.com/forum/#!topic/kaldi-help/Ij3lQLCinN8) * [egs] Fix bug in AMI s5b RE overlapping segments that causes Fixed overlap segment bug (kaldi-asr#3503) * [egs] Small cosmetic change in extend_vocab_demo.sh (kaldi-asr#3516) * [src] Cosmetic changes; fix windows-compile bug reported by @spencerkirn (kaldi-asr#3515) * [src] Move cuda gpu from nnetbin to nnet3bin. (kaldi-asr#3513) * fix a bug in egs/voxceleb/v1/local/make_voxceleb1_v2.pl when preparing the file data/voxceleb1_test/trials (kaldi-asr#3512) * [egs] Fixed some bugs in mgb_data_prep.sh of mgb2_arabic (kaldi-asr#3501) * [src,scripts] fix various typos and errors in comments (kaldi-asr#3454) * [src] Move cuda-compiled to nnet3bin (kaldi-asr#3517) * [src] Fix binary name in Makefile, RE cuda-compiled (kaldi-asr#3518) * [src] buffer fix in cudafeat (kaldi-asr#3521) * [src] Hopefully make it possible to use empty FST in grammar-fst (kaldi-asr#3523) * [src] Add option to convert pdf-posteriors to phones (kaldi-asr#3526) * [src] Fix GetDeltaWeights for long-running online decoding (kaldi-asr#3528) * Fix GetDeltaWeights for long-running online decoding. 
Use frame count relative to decoder start internally in silence weighting and update to the frame count relative to pipeline only once result is calculated. * Add a note about transition to a new function API * [src] Small fix to post-to-phone-post.cc (problem introduced in from kaldi-asr#3526) (kaldi-asr#3534) * [src]: adding Dan's fix to a bug in nnet-computation-graph (kaldi-asr#3531) * [egs] Replace prep_test_aspire_segmentation.sh (kaldi-asr#2943) (kaldi-asr#3530) * [egs] OCR: Decomposition for CASIA and YOMDLE_ZH datasets (kaldi-asr#3527) * [build] check_dependencies.sh: correct installation command for fedora (kaldi-asr#3539) * [src,doc] Fix bug in new option of post-to-phone-post; skeleton of faq page (kaldi-asr#3540) * [egs,scripts] Adding possibility to have 'online-cmn' on input of 'nnet3' models (kaldi-asr#3498) * [scripts] Fix to build_tree_multiple_sources.sh (kaldi-asr#3545) * [doc] Fix accidental overwriting of kws page (kaldi-asr#3541) * [egs] Fix regex filter in Fisher preprocessing (was excluding 2-letter sentences like "um") (kaldi-asr#3548) * [scripts] Fix to bug introduced in kaldi-asr#3498 RE ivector-extractor options mismatch. (kaldi-asr#3549) * [scripts] Fix awk compatibility issue; be more careful about online_cmvn file (kaldi-asr#3550) * [src] Add a method for backward-compatibility with previous API (kaldi-asr#3536) * [src] Feature bank feature extraction using CUDA (kaldi-asr#3544) Following this change, both MFCC and fbank run through a single code path with parameters (use_power, use_log_fbank and use_dct) controlling the flow. CudaMfcc has been renamed to CudaSpectralFeatures. It contains an MfccOptions structure which contains FrameOptions and MelOptions. It can be initialized either with an MfccOptions object or an FbankOptions object. 
Compared with CudaMfccOptions, CudaSpectralOptions also contains these parameters use_dct - switches on the discrete cosine and lifter use_log_fbank - takes the log of the MEL banks values use_power - uses power in place of abs(amplitude) Each of these is defaulted on for MFCC. For fbank, use_dct is set to false. The others are set by user parameters. Also added a unit test for CUDA Fbank (cudafeatbin/compute-fbank-feats-cuda). * [src] Fix missing semicolon (kaldi-asr#3551) * [src] fix a typo mistaking not equal for assign sign in CUDA feature pipeline (kaldi-asr#3552) * [src] Fix issue kaldi-asr#3401 (crash in ivector extraction with max-remembered-frames+silence-weight) (kaldi-asr#3405) * [egs] Librispeech: in RESULTS, change best_wer.sh => utils/best_wer.sh (kaldi-asr#3553) * [egs] semisupervised recipes: fixing some variables in comments (kaldi-asr#3547) * [scripts] fix utils/lang/extend_lang.sh to add nonterm symbols in align_lexicon.txt (kaldi-asr#3556) * [scripts] Fix to bug in steps/data/data_dir_manipulation_lib.py (kaldi-asr#3174) * [src] Fix in nnet3-attention-component.cc, RE required context (kaldi-asr#3563) * [src] Temporarily turn off some model-collapsing code while investigating a bug (kaldi-asr#3565) * [scripts] Fix to data cleanup script RE utf-8 support in uttearnce * [src,scripts,egs] online-cmvn for online2 with chain models, (kaldi-asr#3560) * Add OnlineCMVN to Online NNET2 pipeline. This is used in some models (e.g. CVTE). This is optional and off by default. It applies CMVN before applying pitch. This code is essentially copied out of "online2/online-feature-pipeline.cc/h". Patch set provided by Levi Barnes. * online2: bugfix of config script, include ivector_period into the iextractor config file. 
* online-cmvn in online-nnet2-feature-pipeline,
  - update the C++ code from @luitjens,
  - introduced `OnlineFeatureInterface *nnet3_feature_` to explicitly mark features that are passed to nnet3 model
  - added the transfer of OnlineCmvnState across utterances from same speaker
  - update the 'prepare_online_decoding.sh' to support online-cmvn
  - enabled OnlineCmvnStats transfer in training/decoding
* OnlineNnet2FeaturePipeline, removing unused constructor, updating results
* [build,src] Change to error message; update kaldi_lm install script (and kaldi_lm) to fix compile issue (kaldi-asr#3581)
* [src] Clarify something in plda.h (cosmetic change) (kaldi-asr#3588)
* [src] Small fix to online ivector extraction (RE kaldi-asr#3401/kaldi-asr#3405), thanks: Vladmir Vassiliev. (kaldi-asr#3592)
  Stops an assert failure by changing the assert condition
* [egs] Add recipe for Chinese training with multiple databases (kaldi-asr#3555)
* [src] Remove duplicate `num_done++` in apply-cmvn-online.cc (kaldi-asr#3597)
* [src] Fix to CMVN with CUDA (kaldi-asr#3593)
  * __restrict__ not __restrict__
  * Fixing a complaint during Windows compilation
  * Plumbing for CUDA CMVN
  * Removing detritus. Removing comments, an unused variable and NULL'ing some object pointers (just in case)
* [src] Fix two bugs in batched-wav-nnet3-cuda binary. (kaldi-asr#3600)
  1) Using the "Any" apis prior to finishing submission of the full group could lead to a group finishing early. This would cause output to appear repeatedly. I have not seen this occur but an audit revealed it as an issue. The fix is to use the "Any" APIs only when full groups have been submitted.
  2) GetNumberOfTasksPending() can return zero even though groups have not been waited for. This API call should not be used to determine if all groups have been completed as the number of pending tasks is independent of the number of groups remaining.
* [scripts] propagated bad whitespace fix to validate_dict_dir.pl (cosmetic change) (kaldi-asr#3601)
* [scripts] bug-fix on subset_data_dir.sh with --per-spk option (kaldi-asr#3567) (kaldi-asr#3572)
  The script failed to return the number of utterances `per-spk` requested, despite being available in the original spk2utt file. The stop-boundary on the awk's for-loop was off by 1.
  > An example of how this affects the spk2utt files
  ```
  $ cat tmpData/spk2utt
  spk1 spk1-utt1 spk1-utt2 spk1-utt3
  spk2 spk2-utt1
  spk3 spk3-utt1 spk3-utt2
  spk4 spk4-utt1 spk4-utt2 spk4-utt3 spk4-utt4
  $
  $ ./subset_data_dir.sh --per-spk tmpData 2 tmpData-before
  $ cat tmpData-before/spk2utt
  spk1 spk1-utt1
  spk2 spk2-utt1
  spk3 spk3-utt1
  spk4 spk4-utt1 spk4-utt3
  $
  $ ./subset_data_dir_MOD.sh --per-spk tmpData 2 tmpData-after
  $ cat tmpData-after/spk2utt
  spk1 spk1-utt1 spk1-utt2
  spk2 spk2-utt1
  spk3 spk3-utt1 spk3-utt2
  spk4 spk4-utt1 spk4-utt3
  ```
* [src] Code changes to support GCC9 + OpenFST1.7.3 + C++2a (namespace issues) (kaldi-asr#3570)
* [scripts] Training ivector-extractor: make online-cmvn per speaker. (kaldi-asr#3615)
* [src] cached compiler I/O for nnet3-xvector-compute (kaldi-asr#3197)
* [src] Fixed online2-tcp-nnet3-decode-faster.cc poll_ret checks (kaldi-asr#3611)
* [scripts] Call utils/data/get_utt2dur.sh using the correct $cmd and $nj (kaldi-asr#3608)
* [scripts] Enable tighter control of downloaded dependencies (kaldi-asr#3543) (kaldi-asr#3573)
* [scripts] Make reverberate_data_dir.py handle vad.scp (kaldi-asr#3619)
* [scripts] Don't get utt2dur in librispeech.. will be made by make_mfcc.sh now (kaldi-asr#3610) (kaldi-asr#3620)
* [scripts] Make combine_ali_dirs.sh work when queue.pl is used (kaldi-asr#3537) (kaldi-asr#3561)
* [egs] Fix duplicate removal of unk from Librispeech decoding graphs (kaldi-asr#3476) (kaldi-asr#3621)
* [build] Check for gfortran, needed by OpenBLAS (for lapack) (kaldi-asr#3622)
* [scripts] VB resegmentation: load MFCCs only when used (save memory) (kaldi-asr#3612)
* [build] removed old Docker files - see docker in the root folder for the latest files (kaldi-asr#3558)
* [src] Fix to compute-mfcc-feats.cc (thanks: @puneetbawa) (kaldi-asr#3623)
* [egs] Update AMI tdnn recipe to reflect what was really run, and add hires feats. (kaldi-asr#3578)
* [egs] Fix to run_ivector_common.sh in swbd (crash when running from stage 3) (kaldi-asr#3631)
* [scripts] Make data augmentation work with UTF-8 utterance ids/filenames (kaldi-asr#3633)
* [src] fix a bug in src/online/online-faster-decoder.cc (prevent segfault with some models) (kaldi-asr#3634)
* [scripts] Make extend_lang.sh support lexiconp_silprob.txt (kaldi-asr#3339) (kaldi-asr#3632)
* [scripts] Fix typo in analyze_alignments.sh (kaldi-asr#3635)
* [scripts] Change the GPU policy of align.sh to wait instead of yes (kaldi-asr#3636)
  This is needed to avoid failures when using more jobs than GPUs. Previously, with --use-gpu=yes, the job would error out after 20s when all GPUs were occupied. Changing the policy to wait gives the correct behavior.
* [egs] Added tri3b and chain training for Aurora4 (kaldi-asr#3638)
* [build] fixed broken Docker builds by adding gfortran package (kaldi-asr#3640)
* [src] Fix bug in resampling checks for online features (kaldi-asr#3639)
  The --allow-upsample and --allow-downsample options were not handled correctly in the code that handles resampling for computing online features.
* [build] Bump OpenBLAS version to 0.3.7 and enable locking (kaldi-asr#3642)
  Bumps OpenBLAS version to 0.3.7 for improved performance with Zen architecture. Also sets USE_LOCKING which fixes issue when calling a single-threaded OpenBLAS from several threads in parallel
* [scripts] Update nnet3_to_dot.py to ignore Scale descriptors (kaldi-asr#3644)
* [src] Speed optimization for decoders (call templates properly) (kaldi-asr#3637)
* [doc] update FAQ page to include some FAQ candidates from Kaldi mailing lists (kaldi-asr#3646)
* [egs] Small fix: duplicate xent settings from examples (kaldi-asr#3649)
* [doc] Fix typo (kaldi-asr#3648)
* [egs] Fix a bug in Tedlium run_ivector_common.sh (kaldi-asr#3647)
  Running perturb_data_dir_speed_3way.sh after making MFCC would cause an error, saying "feats.scp already exists". It is also unnecessary to run it twice.
* [src] Fix to matrix-lib-test to ignore small difference (kaldi-asr#3650)
* [doc] update FAQ page: (kaldi-asr#3651)
  1. Added some new FAQ candidates from mailing lists.
  2. Added section for logo, version and books recommendation for beginners.
* [scripts] Modify split_data.sh to split data evenly when utt2dur exists (kaldi-asr#3653)
* [doc] update FAQ page: added section for free dataset, python wrapper, etc. (kaldi-asr#3652)
* [src] Some CUDA i-vector fixes (kaldi-asr#3660)
  no longer assume we are starting at frame 0. This is needed for eventual online decoding.
  src/cudafeat/online-ivector-feature-cuda.cc: Added code to use LU-decomposition for debugging. Cholesky's is still used and works fine but this is a good test if something wrong is suspected in the solver. Added support for older toolkits but with a less efficient algorithm.
* [src] CUDA batched decoder pipeline fixes (kaldi-asr#3659)
  bug fix: don't assume stride and number of columns is the same when packing matrix into a vector.
  src/cudadecoder/batched-threaded-nnet3-cuda-pipeline.cc: added additional sanity checking to better report errors in the field.
  no longer pinning memory for copying waves down. This was causing consistency issues. It is unclear why the code is not working and will continue to evaluate. This optimization doesn't add a lot of perf so we are disabling it for now.
  general cleanup
  fixed bug with tasks array possibly being resized before being read.
  src/cudadecoderbin/batched-wav-nnet3-cuda.cc: Now outputting every iteration as a different lattice. This way we can score every lattice and better ensure correctness of binary.
  clang-format (removing tabs)
* [egs] Small fix to aspire example: pass in num_data_reps (kaldi-asr#3665)
* [doc] Update FAQ page (kaldi-asr#3663)
  1. Added section of Android for Kaldi
  2. list some useful scripts in data processing
  3. Illustrate some common tools
* [src] Add batched xvector computation (kaldi-asr#3643)
* [src] CUDA decoder: write all output to single stream (kaldi-asr#3666)
  .. instead of one per iteration. We are seeing that small corpus are dominated by writer Open/Close. A better solution is to modify the key with the iteration number and then handle the modified keys in scoring. Note that if you have just a single iteration the key is not modified and thus behavior is as expected.
* [egs] Fix sox command in multi-Updated sox command (kaldi-asr#3667)
* [src] Change data-type for hash in lattice alignment (avoid debugger complaints) (kaldi-asr#3672)
* [egs] Two fixes to multi_en setup (kaldi-asr#3676)
* [scripts] Fix misspelled variable in reverb script (kaldi-asr#3678)
* Update cudamatrix.dox (kaldi-asr#3674)
* [scripts] Add reduced-context option for TDNN-F layers. (kaldi-asr#3658)
* [egs] Make librispeech run_tdnn_discriminative.sh more efficient with disk (kaldi-asr#3685)
* [egs] Remove pitch from multi_cn nnet3 recipe (kaldi-asr#3686)
* [egs] multi_cn: clarify metric is CER (kaldi-asr#3687)
* [src,minor] Fix typo in comment in kaldi-holder.h (kaldi-asr#3691)
* [egs] update run_tdnn_discriminative.sh in librispeech recipe (kaldi-asr#3689)
* [egs] Aspire recipe: fixed typo in utterance and speaker prefix (kaldi-asr#3696)
* [src] cuda batched decoder, fix memory bugs (kaldi-asr#3697)
  bug fix: Pass token by reference so we can return an address of it. Previously it was done by value which means we returned an address of a locally scoped variable. This led to memory corruptions.
  re-add pinned memory (i.e. disable workaround)
  Free memory on control thread, switch shared_ptr to unique_ptr, don't reset unique ptr.
* [scripts] Change make_rttm.py to read/write files with UTF-8 encoding (kaldi-asr#3705)
* [src] Removing non-compiling paranoid asserts in nnet-computation-graph. (kaldi-asr#3709)
* [build] Fix gfortran package name for centos (kaldi-asr#3708)
* [scripts] Change the Python diagnostic scripts to accept non-ASCII UTF-8 phone set (kaldi-asr#3711)
* [scripts] Fix some issues in kaldi-asr#3653 in split_scp.pl (kaldi-asr#3710)
* [scripts] Fix 2 issues in nnet2->nnet3 model conversion script (kaldi-asr#886)(kaldi-asr#3713)
  Parsing learning rates would fail if it was written in e-notation (e.g. 1e-5). Also fixes ivector-dim=0 which could result if const component dim=0 in src nnet2 model.
* Removing changes to split_scp.pl (kaldi-asr#3717)
* [scripts] Improve how combine_ali_dirs.sh gets job-specific filenames (kaldi-asr#3720)
* [src] Add --debug-level=N configure option to control assertions (kaldi-asr#3690) (kaldi-asr#3700)
* [src] Adding some more feature extraction options (needed by some users..) (kaldi-asr#3724)
* [src,script,egs] Goodness of Pronunciation (GOP) (kaldi-asr#3703)
* [src] Making ivector extractor tolerate dim mismatch due to pitch (kaldi-asr#3727)
* Revert "[src] Making ivector extractor tolerate dim mismatch due to pitch (kaldi-asr#3727)" (kaldi-asr#3728)
  This reverts commit 59255ae.
* [src] Fix NVCC compilation errors on Windows (kaldi-asr#3741)
  * make nvcc+msvc happier when two-phase name lookup involved, NOTE: in simple native case (CPU, without cuda), msvc is happy with TemplatedBase<T>::base_value, but nvcc is not...
  * position of __restrict__ in msvc is restricted
* [build] Add CMake Build System as alternative to current Makefile-based build (kaldi-asr#3580)
* [scripts] Modify split_data_dir.sh and split_scp.pl to use utt2dur if present, to balance splits
* [scripts] fix slurm.pl error (kaldi-asr#3745)
* Revert "[scripts] Modify split_data_dir.sh and split_scp.pl to use utt2dur if present, to balance splits" (kaldi-asr#3746)
  This reverts commit 1d0b267.
* [egs] Children's speech ASR recipe for cmu_kids and cslu_kids (kaldi-asr#3699)
* [src] Incremental determinization [cleaned up/rewrite] (kaldi-asr#3737)
* [scripts] Add scripts to create combine fmllr-transform dirs (kaldi-asr#3752)
* [src] CUDA decoder: fix invalid-lattice error that happens in corner cases (kaldi-asr#3756)
* [egs] Add Chime 6 baseline system (kaldi-asr#3755)
* [scripts] Fix issue in copy_lat_dir.sh affecting combine_lat_dirs.sh (missing phones.txt) (kaldi-asr#3757)
* [src] Add missing #include, needed for CUDA decoder compilation on some platforms (kaldi-asr#3759)
* [scripts] fix bug in steps/data/reverberate_data_dir.py (kaldi-asr#3762)
* [src] CUDA allocator: fix order of next largest block (kaldi-asr#3739)
* [egs] some fixes in Chime6 baseline system (kaldi-asr#3763)
  * changed scoring tool for diarization
  * added comment for scoring
  * fixing number of deletions, adding script to check DP result of the total errors is equivalent to the sum of the individual errors
  * updated RESULTS for new diarization scoring
  * outputting wer similar to compute-wer routine
  * adding routine to select best LM weight and insertion penalty factor based on the development set
  * updating results
  * changing lang_chain to lang, minor fix
  * adding all array option
  * change in directory structure of scoring_kaldi_multispeaker to make it similar to scoring_kaldi
  * removing test sets from run ivector script
  * added ref RTTM creation
  * making modifications for all array
  * minor fix
* [src] CUDA decoding: add support for affine transforms to CUDA feature extractor (kaldi-asr#3764)
* [src] relax assertion constraint slightly (RE matrix orthonormalization) (kaldi-asr#3767)
* [src] CUDA decoder: fix bug in NumPendingTasks() (kaldi-asr#3769)
* [src] Add options to select specific gpu, reuse cuda context (kaldi-asr#3750)
* [src] Move CheckAndFix to config struct (kaldi-asr#3749)
* [egs,scripts] Add recipes for CN-Celeb (kaldi-asr#3758)
* [src] CUDA decoder: remove unnecessary sync that was added for debugging (kaldi-asr#3770)
* [src] CUDA decoder: shrink channel vectors instead of vector holding those vectors (kaldi-asr#3768)
* add include path
* update test

…-asr#886)(kaldi-asr#3713) Parsing learning rates would fail if it was written in e-notation (e.g. 1e-5). Also fixes ivector-dim=0 which could result if const component dim=0 in src nnet2 model.
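
To illustrate the learning-rate problem described in that commit message, here is a minimal Python sketch. It is not the code from the conversion script or from #3713; the function names, the `FLOAT_RE` pattern, and the sample text line are hypothetical, and the `<LearningRate>` token layout is assumed from the nnet2 text format. It only shows why a digits-and-dots pattern mishandles e-notation and how a more general float pattern avoids that.

```python
import re

# Hypothetical pattern: accepts plain decimals (0.0015) and
# e-notation (1e-05, 2.5E-3), with an optional sign.
FLOAT_RE = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def parse_learning_rate_naive(line):
    """Digits-and-dots only: stops at the 'e' in values like 1e-05."""
    m = re.search(r"<LearningRate>\s+([0-9.]+)", line)
    return float(m.group(1)) if m else None


def parse_learning_rate(line):
    """Same lookup, but the captured group also covers an exponent part."""
    m = re.search(r"<LearningRate>\s+(" + FLOAT_RE + r")", line)
    return float(m.group(1)) if m else None


if __name__ == "__main__":
    # Assumed sample of a component header line in text format.
    sample = "<AffineComponent> <LearningRate> 1e-05 <LinearParams>  ["
    print(parse_learning_rate_naive(sample))  # reads only the leading "1"
    print(parse_learning_rate(sample))        # reads the full "1e-05"
```

The naive version does not raise an error here; it silently returns a wrong value, which is arguably worse. Depending on how the pattern is anchored, the same mistake could instead show up as a failed match.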

vijayaditya added the enhancement and help wanted labels Jul 7, 2016

jfainberg mentioned this issue May 7, 2017

Python script to convert nnet2 to nnet3 models #1611

Merged

jfainberg mentioned this issue Mar 4, 2019

Read const_component_dim during conversion of nnet2 to nnet3 model #3071

Merged

danpovey closed this as completed in #3071 Mar 4, 2019

danpovey pushed a commit that referenced this issue Mar 4, 2019

[scripts] bug fix to nnet2->3 conversion, fixes #886 (#3071)

d21be2d

chenzhehuai mentioned this issue Jun 4, 2019

update (#32) chenzhehuai/kaldi#33

Closed

jfainberg mentioned this issue Nov 12, 2019

Fix parsing of learning rates and ivector-dim=0 #3713

Merged
