
WIP: add DFSMN related codes and example scripts #2437

Closed
wants to merge 16 commits

Conversation

@tramphero commented May 21, 2018

An implementation of DFSMN (https://arxiv.org/pdf/1803.05030.pdf) in nnet1.
I have evaluated three configurations of DFSMN (DFSMN_S, DFSMN_M, DFSMN_L) with different model sizes on the Librispeech task. The detailed experimental results are shown in the RESULTS file.

@danpovey (Contributor) left a comment

Thanks for the PR.
I don't think there is a good chance that I will want to commit this, as it's a lot of extra code, it uses the nnet1 framework which is kind of deprecated, and it's not really clear that DFSMN has an advantage over the current TDNN-F scripts of the 'chain' models (which are presumably much faster to train and test).
But it's definitely interesting work; @GaofengCheng has been experimenting with things based on this.

@@ -0,0 +1,231 @@
// featbin/append-ivector-to-feats.cc
Contributor:

I think this file is probably not necessary; the nnet2 scripts already do this task based on existing binaries.

@@ -345,16 +345,7 @@ template<typename Real> void CuVectorUnitTestSum() {
A.SetRandn();
ones.Set(1.0);

Real x = VecVec(A, ones);
Real y = A.Sum();
Contributor:

This and a couple of other files seem to have some changes that are reversions of recent changes to master. Possibly a sign of some weirdness in your use of git?

//// FSMN kernel functions ///////
////////////////////////////////////////////////////

// forward operation in memory block
Contributor:

If we were to commit this stuff I'd probably want these to be non-member functions as they are pretty specialized-- maybe in cu-math.h. But see my overall review.
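For readers unfamiliar with the memory block under discussion, a minimal CPU sketch of the FSMN forward operation, written as a free function in the spirit suggested above, might look like the following. The function name, argument layout, and the choice to start both sums at offset 1 are illustrative assumptions, not the PR's actual code:

```cpp
// Minimal CPU sketch of the FSMN memory-block forward pass (the vectorized
// form from the DFSMN paper): each output frame is the input frame plus
// element-wise weighted sums of strided past and future frames.
// NOT the PR's actual code; names and stride handling are illustrative.
#include <cassert>
#include <cmath>
#include <vector>

// in: rows x cols, row-major, frame t at in[t*cols .. t*cols+cols-1].
// look_back: l_order x cols filter weights; look_ahead: r_order x cols.
void FsmnMemoryForward(const std::vector<float> &in, int rows, int cols,
                       const std::vector<float> &look_back,
                       const std::vector<float> &look_ahead,
                       int l_order, int r_order, int stride,
                       std::vector<float> *out) {
  assert(static_cast<int>(in.size()) == rows * cols);
  out->assign(in.begin(), in.end());  // identity path: p_t starts from h_t
  for (int t = 0; t < rows; ++t) {
    for (int i = 1; i <= l_order; ++i) {  // weighted past frames
      int src = t - i * stride;
      if (src < 0) break;
      for (int d = 0; d < cols; ++d)
        (*out)[t * cols + d] += look_back[(i - 1) * cols + d] * in[src * cols + d];
    }
    for (int j = 1; j <= r_order; ++j) {  // weighted future frames
      int src = t + j * stride;
      if (src >= rows) break;
      for (int d = 0; d < cols; ++d)
        (*out)[t * cols + d] += look_ahead[(j - 1) * cols + d] * in[src * cols + d];
    }
  }
}
```

A CUDA kernel version would assign one thread per (frame, dimension) pair, but the per-element arithmetic is the same.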

Author:

Thanks a lot for the review and suggestions. I agree that adding these DFSMN-specialized kernel functions conflicts with Kaldi's code style. We will try to think of alternative methods, although they may not be as efficient as the CUDA kernel functions.

Contributor:

I was just saying they should be non-member functions, they would still be CUDA kernels.
But that was a minor point, and fixing it would likely not make me want to merge this.
My problem is a little more fundamental-- it's that there is a lot of code, it's based on a setup that is basically deprecated, and it provides no clear advantage versus our current best models based on TDNN-F.

@KarelVesely84 (Contributor)

Hi, sorry for not reacting sooner. I like the idea of having a new 'nnet1' model that will be better than the current one... For the moment, I am busy with another project, but I'll put a note to return to this later... Thanks, K.

linearity_.Resize(OutputDim(), InputDim());
RandGauss(0.0, param_stddev, &linearity_);
// if xavier_flag=1, use the "Xavier" initialization
if(xavier_flag){
Contributor:

Why do you need this 'special' initialization hard-coded?

1. Xavier Glorot's initialization was for the networks only.
2. It is possible to set the mean/variance of the initialization externally. Was it helpful in some way?

Author:

1. Xavier Glorot's initialization is suitable for networks with ReLU activations. In our previous work (https://pdfs.semanticscholar.org/d39f/468f285fd4034bceaa37b9f3b7e2285f99fe.pdf), we found that ReLU-DNNs are more sensitive to the dynamic range of the initial weights, and that the scaled Xavier initialization (adding a beta=0.5) provides a good starting point.
2. Setting the mean/variance of the initialization externally is OK, but for networks with different hidden-layer sizes we would need to calculate the dynamic range and then set the Gaussian variance for each.
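As an illustration of the scaled initialization discussed above, a sketch might look like this. Using the Gaussian (rather than uniform) Glorot variant, and multiplying the standard deviation directly by beta, are assumptions; the PR's actual formula may differ:

```cpp
// Sketch of a scaled "Xavier" (Glorot) Gaussian initialization.
// Glorot's Gaussian variant uses stddev = sqrt(2 / (fan_in + fan_out));
// beta = 0.5 is the scaling mentioned in the reply above.  Whether the PR
// applies beta to the stddev exactly like this is an assumption.
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

float ScaledXavierStddev(int fan_in, int fan_out, float beta = 0.5f) {
  return beta * std::sqrt(2.0f / static_cast<float>(fan_in + fan_out));
}

// Fill an out_dim x in_dim weight matrix (row-major) with zero-mean
// Gaussian samples at the scaled-Xavier standard deviation.
void InitLinearity(int out_dim, int in_dim, unsigned seed,
                   std::vector<float> *linearity) {
  std::mt19937 rng(seed);
  std::normal_distribution<float> gauss(
      0.0f, ScaledXavierStddev(in_dim, out_dim));
  linearity->resize(static_cast<size_t>(out_dim) * in_dim);
  for (float &w : *linearity) w = gauss(rng);
}
```

The point of deriving the stddev from the fan-in/fan-out is exactly the convenience mentioned in the reply: layers of different sizes get an appropriate dynamic range without setting the variance externally per layer.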

@KarelVesely84 KarelVesely84 changed the title add DFSMN related codes and example scripts WIP: add DFSMN related codes and example scripts Jul 24, 2018
Modify the ivector folder name
revise proto dir
@danpovey (Contributor) commented Aug 7, 2018

Guys, I just wanted to mention that @GaofengCheng has been working on various DFSMN-related ideas (also incorporating some ideas from our latest TDNN-F models with resnet-style skip connections) and he has now been able to match our latest TDNN-F numbers with a DFSMN-style system. I hope with more tuning he will get even better results. If it seems to outperform TDNN-F more generally, we will definitely switch to more DFSMN-type models.

@danpovey danpovey closed this Dec 13, 2018
@zhuanaa commented Feb 25, 2020

Hi @tramphero. After changing the filename and relative paths, I got the error log below:
nnet-train-fsmn-streams --cross-validate=true --randomize=false --verbose=0 --minibatch-size=4096 --feature-transform=exp/tri7b_DFSMN_M/final.feature_transform "ark:copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:- | apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- | append-ivector-to-feats --online-ivector-period=10 ark:- 'scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp' ark:- |" 'ark:ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl "ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |" ark:- | ali-to-post ark:- ark:- |' exp/tri7b_DFSMN_M/nnet.init
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:SelectGpuId():cu-device.cc:191) CUDA setup operating under Compute Exclusive Mode.
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:FinalizeActiveGpu():cu-device.cc:247) The active GPU is [0]: Tesla P40 free:22688M, used:231M, total:22919M, free/total:0.989921 version 6.1
copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:-
add-deltas --delta-order=2 ark:- ark:-
append-ivector-to-feats --online-ivector-period=10 ark:- scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp ark:-
apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:-
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:Init():nnet-randomizer.cc:32) Seeding by srand with : 777
LOG (nnet-train-fsmn-streams[5.4.164~1-43822]:main():nnet-train-fsmn-streams.cc:163) CROSS-VALIDATION STARTED
ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl 'ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |' ark:-
ali-to-post ark:- ark:-
ERROR (nnet-train-fsmn-streams[5.4.164~1-43822]:PosteriorToMatrix():posterior.cc:484) Out-of-bound Posterior element with index 5799, higher than number of columns 5777

[ Stack-Trace: ]
nnet-train-fsmn-streams() [0x767010]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
void kaldi::PosteriorToMatrix(std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, int, kaldi::Matrix)
void kaldi::nnet1::PosteriorToMatrix(std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, int, kaldi::CuMatrix)
kaldi::nnet1::Xent::Eval(kaldi::VectorBase const&, kaldi::CuMatrixBase const&, std::vector<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > >, std::allocator<std::vector<std::pair<int, float>, std::allocator<std::pair<int, float> > > > > const&, kaldi::CuMatrix*)
main
__libc_start_main
nnet-train-fsmn-streams() [0x580009]

WARNING (nnet-train-fsmn-streams[5.4.164~1-43822]:Close():kaldi-io.cc:515) Pipe ali-to-pdf exp/tri6b_cleaned_ali_train_960_cleaned/final.mdl "ark:gunzip -c exp/tri6b_cleaned_ali_dev_clean/ali.*.gz |" ark:- | ali-to-post ark:- ark:- | had nonzero return status 36096
WARNING (nnet-train-fsmn-streams[5.4.164~1-43822]:Close():kaldi-io.cc:515) Pipe copy-feats scp:exp/tri7b_DFSMN_M/cv.scp ark:- | apply-cmvn --norm-means=true --norm-vars=false --utt2spk=ark:data_fbank/dev_clean/utt2spk scp:data_fbank/dev_clean/cmvn.scp ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- | append-ivector-to-feats --online-ivector-period=10 ark:- 'scp:exp/nnet3_cleaned/ivectors_train_960_dev_hires/ivector_online.scp' ark:- | had nonzero return status 36096
Have you come across this? Thanks for your help.
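The error above means one of the alignment pdf-ids (5799) indexes past the network's output dimension (5777), i.e. the alignments and the nnet come from mismatched systems. A sketch of the kind of pre-training sanity check that would catch this early (illustrative only, not part of Kaldi or this PR):

```cpp
// Hypothetical sanity check: verify that every target pdf-id fits within
// the network's output dimension before training starts.  This is the
// failure mode reported in the log above (index 5799 vs. 5777 columns).
#include <cassert>
#include <stdexcept>
#include <vector>

void CheckTargetsFitOutputDim(const std::vector<std::vector<int>> &pdf_ids,
                              int output_dim) {
  for (const auto &utt : pdf_ids)
    for (int pdf : utt)
      if (pdf < 0 || pdf >= output_dim)
        throw std::runtime_error(
            "Out-of-bound pdf-id: alignments and network appear to come "
            "from different systems (wrong final.mdl or ali dir?).");
}
```

In practice the mismatch usually means the alignment directory was produced by a different model than the one used to build the nnet.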

@tramphero (Author) commented Feb 25, 2020 via email

@zhuanaa commented Feb 25, 2020

I have fixed it according to #2530. Thank you!

7 participants