WIP: add DFSMN related codes and example scripts #2437
Conversation
Thanks for the PR.
I don't think there is a good chance that I will want to commit this, as it's a lot of extra code, it uses the nnet1 framework which is kind of deprecated, and it's not really clear that DFSMN has an advantage over the current TDNN-F scripts of the 'chain' models (which are presumably much faster to train and test).
But it's definitely interesting work; @GaofengCheng has been experimenting with things based on this.
@@ -0,0 +1,231 @@
// featbin/append-ivector-to-feats.cc
I think this file is probably not necessary; the nnet2 scripts already do this task based on existing binaries.
@@ -345,16 +345,7 @@ template<typename Real> void CuVectorUnitTestSum() {
A.SetRandn();
ones.Set(1.0);

Real x = VecVec(A, ones);
Real y = A.Sum();
This and a couple of other files seem to have some changes that are reversions of recent changes to master. Possibly a sign of some weirdness in your use of git?
//// FSMN kernel functions ///////
////////////////////////////////////////////////////

// forward operation in memory block
If we were to commit this stuff I'd probably want these to be non-member functions as they are pretty specialized-- maybe in cu-math.h. But see my overall review.
Thanks a lot for the review and suggestions. I agree that adding these DFSMN-specialized kernel functions conflicts with the code style of Kaldi. We will try to think of some alternative methods, although they may not be as efficient as the CUDA kernel functions.
I was just saying they should be non-member functions, they would still be CUDA kernels.
But that was a minor point, and fixing it would likely not make me want to merge this.
My problem is a little more fundamental-- it's that there is a lot of code, it's based on a setup that is basically deprecated, and it provides no clear advantage versus our current best models based on TDNN-F.
Hi, sorry for not reacting sooner. I like the idea of having a new 'nnet1' model that will be better than the current one... For the moment, I am busy with another project, but I'll put a note to return to this later... Thanks, K.
linearity_.Resize(OutputDim(), InputDim());
RandGauss(0.0, param_stddev, &linearity_);
// if xavier_flag == 1, use the "Xavier" initialization
if (xavier_flag) {
Why do you need this 'special' initialization hard-coded?
- Xavier Glorot's initialization was for the networks only.
- It is possible to set the mean/variance of the initialization externally.
Was it helpful in some way?
- Xavier Glorot's initialization is suitable for networks with ReLU activations. In our previous work (https://pdfs.semanticscholar.org/d39f/468f285fd4034bceaa37b9f3b7e2285f99fe.pdf), we found that ReLU DNNs are more sensitive to the dynamic range of the initial weights, and that a scaled version of Xavier Glorot's initialization (scaled by beta=0.5) provides a good starting point.
- Setting the mean/variance of the initialization externally is OK, but for networks with different hidden-layer sizes we would then need to calculate the dynamic range and set the Gaussian variance by hand for each layer.
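To make the discussion above concrete, here is a minimal standalone sketch (my own illustration, not the PR's code; the function name and defaults are made up) of the scaled "Xavier" initialization being described: Glorot's stddev sqrt(2 / (fan_in + fan_out)), multiplied by a scale factor beta = 0.5, so the dynamic range does not need to be set externally per layer size.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Hedged sketch (not Kaldi code): draw a fan_out x fan_in weight matrix
// (stored flat, row-major) from a Gaussian whose stddev is the Glorot
// value sqrt(2 / (fan_in + fan_out)) scaled by beta, as described above.
// Name, beta default, and seed handling are illustrative assumptions.
std::vector<float> ScaledXavierInit(int fan_in, int fan_out,
                                    float beta = 0.5f,
                                    unsigned seed = 777) {
  const float stddev = beta * std::sqrt(2.0f / (fan_in + fan_out));
  std::mt19937 rng(seed);
  std::normal_distribution<float> gauss(0.0f, stddev);
  std::vector<float> w(static_cast<size_t>(fan_in) * fan_out);
  for (float &x : w) x = gauss(rng);  // i.i.d. Gaussian entries
  return w;
}
```

With this scheme a 2048x2048 hidden layer gets stddev = 0.5 * sqrt(2/4096), roughly 0.011, without any per-layer external tuning, which is the convenience being argued for.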
then -> done fi
Modify the ivector folder name
revise proto dir
Guys, I just wanted to mention that @GaofengCheng has been working on various DFSMN-related ideas (also incorporating some ideas from our latest TDNN-F models with resnet-style skip connections), and he has now been able to match our latest TDNN-F numbers with a DFSMN-style system. I hope that with more tuning he will get even better results. If it seems to outperform TDNN-F more generally, we will definitely switch to more DFSMN-type models.
"ERROR (nnet-train-fsmn-streams[5.4.164~1-43822]:PosteriorToMatrix():posterior.cc:484) Out-of-bound Posterior element with index 5799, higher than number of columns 5777"
You need to modify the output layer of the net.proto to match the number of modeling units.
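In other words, the final layer's output dimension in the prototype must equal the pdf count of the alignment model (an out-of-bound index of 5799 means the net's 5777 outputs are too few). A hypothetical sketch of the last lines of an nnet1 net.proto follows; all dimensions here are placeholders I made up for illustration, not values from this setup — the real output dimension should be taken from the pdf count of your alignment model:

```
<NnetProto>
...
<AffineTransform> <InputDim> 2048 <OutputDim> 5800 <BiasMean> 0.0 <BiasRange> 0.0
<Softmax> <InputDim> 5800 <OutputDim> 5800
</NnetProto>
```

Both occurrences of the output dimension (5800 above is a placeholder) must match the number of pdfs of the tree used to produce the alignments.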
------------------------------------------------------------------
From: zhuanaa <notifications@github.com>
Sent: Tuesday, February 25, 2020, 10:03
To: kaldi-asr/kaldi <kaldi@noreply.github.com>
Cc: Zhang Shiliang (Zhanliang) <sly.zsl@alibaba-inc.com>; Mention <mention@noreply.github.com>
Subject: Re: [kaldi-asr/kaldi] WIP: add DFSMN related codes and example scripts (#2437)
Hi @tramphero. After changing the filename and relative path, I got the error log below:
nnet-train-fsmn-streams --cross-validate=true --randomize=false --verbose=0 --minibatch-size=4096 --feature-transform=exp/tri7b_DFSMN_M/final.feature_transform "ark:copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:- | apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- | append-ivector-to-feats --online-ivector-period=10 ark:- 'scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp' ark:- |" 'ark:ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl "ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |" ark:- | ali-to-post ark:- ark:- |' exp/tri7b_DFSMN_M/nnet.init
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:SelectGpuId():cu-device.cc:191) CUDA setup operating under Compute Exclusive Mode.
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:FinalizeActiveGpu():cu-device.cc:247) The active GPU is [0]: Tesla P40 free:22688M, used:231M, total:22919M, free/total:0.989921 version 6.1
copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:-
add-deltas --delta-order=2 ark:- ark:-
append-ivector-to-feats --online-ivector-period=10 ark:- scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp ark:-
apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:-
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:Init():nnet-randomizer.cc:32) Seeding by srand with : 777
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:main():nnet-train-fsmn-streams.cc:163) CROSS-VALIDATION STARTED
ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl 'ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |' ark:-
ali-to-post ark:- ark:-
ERROR (nnet-train-fsmn-streams[5.4.164~1-43822]:PosteriorToMatrix():posterior.cc:484) Out-of-bound Posterior element with index 5799, higher than number of columns 5777
[ Stack-Trace: ]
nnet-train-fsmn-streams() [0x767010]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
void kaldi::PosteriorToMatrix(std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, int, kaldi::Matrix)
void kaldi::nnet1::PosteriorToMatrix(std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, int, kaldi::CuMatrix)
kaldi::nnet1::Xent::Eval(kaldi::VectorBase const&, kaldi::CuMatrixBase const&, std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, kaldi::CuMatrix*)
main
__libc_start_main
nnet-train-fsmn-streams() [0x580009]
WARNING (nnet-train-fsmn-streams[5.4.164~1-43822]:Close():kaldi-io.cc:515) Pipe ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl "ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |" ark:- | ali-to-post ark:- ark:- | had nonzero return status 36096
WARNING (nnet-train-fsmn-streams[5.4.164~1-43822]:Close():kaldi-io.cc:515) Pipe copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:- | apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- | append-ivector-to-feats --online-ivector-period=10 ark:- 'scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp' ark:- | had nonzero return status 36096
Have you come across this? Thanks for your help.
I have fixed it according to #2530.
This PR is an implementation of DFSMN (https://arxiv.org/pdf/1803.05030.pdf) in nnet1.
I have evaluated three configurations of DFSMN (DFSMN_S, DFSMN_M, DFSMN_L) with different model sizes on the Librispeech task. Detailed experimental results are shown in the RESULTS file.