
adding dropout-by-row #8

Open

wants to merge 12 commits into base: dropout_schedule
Conversation

GaofengCheng

There is an existing problem, as follows:

LOG (nnet3-chain-train:IsComputeExclusive():cu-device.cc:258) CUDA setup operating under Compute Exclusive Mode.
LOG (nnet3-chain-train:FinalizeActiveGpu():cu-device.cc:225) The active GPU is [4]: Tesla K80   free:11383M, used:135M, total:11519M, free/total:0.988222 version 3.7
nnet3-am-copy --raw=true --learning-rate=0.002 exp/sdm1/chain_cleaned/tdnn_lstm1i_4epoch_dp_test21_byrow_sp_bi_ihmali_ld5/0.mdl -
nnet3-copy '--edits=set-dropout-proportion name=* proportion=0.0' - -
LOG (nnet3-am-copy:main():nnet3-am-copy.cc:149) Copied neural net from exp/sdm1/chain_cleaned/tdnn_lstm1i_4epoch_dp_test21_byrow_sp_bi_ihmali_ld5/0.mdl to raw format as -
LOG (nnet3-copy:ReadEditConfig():nnet-utils.cc:718) Set dropout proportions for 3 nodes.
ERROR (nnet3-chain-train:ExpectToken():io-funcs.cc:201) Expected token "</DropoutComponent>", got instead "<DropoutPerFrame>".

[ Stack-Trace: ]
nnet3-chain-train() [0xb776e0]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::ExpectToken(std::istream&, bool, char const*)
kaldi::nnet3::DropoutComponent::Read(std::istream&, bool)
kaldi::nnet3::Component::ReadNew(std::istream&, bool)
kaldi::nnet3::Nnet::Read(std::istream&, bool)
void kaldi::ReadKaldiObject<kaldi::nnet3::Nnet>(std::string const&, kaldi::nnet3::Nnet*)
main
__libc_start_main
nnet3-chain-train() [0x7d21e9]

WARNING (nnet3-chain-train:Close():kaldi-io.cc:501) Pipe nnet3-am-copy --raw=true --learning-rate=0.002 exp/sdm1/chain_cleaned/tdnn_lstm1i_4epoch_dp_test21_byrow_sp_bi_ihmali_ld5/0.mdl - | nnet3-copy --edits='set-dropout-proportion name=* proportion=0.0'             - - | had nonzero return status 36096

# Accounting: time=2 threads=1
# Finished at Fri Dec 16 18:52:29 CST 2016 with status 255

@@ -154,6 +186,8 @@ void DropoutComponent::Read(std::istream &is, bool binary) {
ReadBasicType(is, binary, &dim_);
ExpectToken(is, binary, "<DropoutProportion>");
ReadBasicType(is, binary, &dropout_proportion_);
ExpectToken(is, binary, "<DropoutPerFrame>");
Owner

Make this back-compatible. Change this to ReadToken and then add an if condition to check which token is present.
See other components where ReadToken is used.


@GaofengCheng, you need to understand what Vimal was saying here- there needs to be back compatibility code for the old format. Search for ReadToken() in the file for examples.

However, the reason for your error is that you need to recompile in 'chainbin/' (and possibly chain/').


.. and, of course, make this back compatible.
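
A minimal sketch of what the back-compatible tail of DropoutComponent::Read() could look like, following the ReadToken suggestion above; the earlier tokens are assumed to be read exactly as in the existing code, and the exact position of <DropoutPerFrame> in the new format is an assumption:

  ExpectToken(is, binary, "<DropoutProportion>");
  ReadBasicType(is, binary, &dropout_proportion_);
  // Back-compatibility: older models were written without <DropoutPerFrame>,
  // so read the next token and branch on it instead of hard-coding it.
  std::string token;
  ReadToken(is, binary, &token);
  if (token == "<DropoutPerFrame>") {
    ReadBasicType(is, binary, &dropout_per_frame_);
    ExpectToken(is, binary, "</DropoutComponent>");
  } else {
    KALDI_ASSERT(token == "</DropoutComponent>");
    dropout_per_frame_ = false;  // old format: keep the old behaviour.
  }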

@@ -233,7 +233,7 @@ void FindOrphanNodes(const Nnet &nnet, std::vector<int32> *nodes);
remove internal nodes directly; instead you should use the command
'remove-orphans'.

set-dropout-proportion [name=<name-pattern>] proportion=<dropout-proportion>
set-dropout-proportion [name=<name-pattern>] proportion=<dropout-proportion> perframe=<perframe>
Owner

This should probably be made per-frame.

if (!config_line.GetValue("proportion", &proportion)) {
KALDI_ERR << "In edits-config, expected proportion to be set in line: "
<< config_line.WholeLine();
}
if (!config_line.GetValue("perframe", &perframe)) {
perframe = false;
Owner

per_frame is a more appropriate name.

@@ -119,16 +131,36 @@ void DropoutComponent::Propagate(const ComponentPrecomputedIndexes *indexes,

BaseFloat dropout = dropout_proportion_;
KALDI_ASSERT(dropout >= 0.0 && dropout <= 1.0);
if(dropout_per_frame_ == true)
Owner

if (dropout_per_frame_) {

out->ApplyHeaviside(); // apply the function (x>0?1:0). Now, a proportion "dropout" will
// be zero and (1 - dropout) will be 1.0.
out->MulElements(in);
}
Owner

} else { is used everywhere else.

Owner

Change every other place too.

Author

@danpovey I am using CopyColFromVec to implement per-row randomization of the matrix, but the speed is really slow. Is there any way to accelerate this? I hope to get some instruction.

Author

very slow...

@@ -524,12 +524,14 @@ std::string NnetInfo(const Nnet &nnet) {
}

void SetDropoutProportion(BaseFloat dropout_proportion,
bool dropout_per_frame ,
Owner

No space before ,

Nnet *nnet) {
dropout_per_frame = false;
Owner

Why is the input to the function ignored?

mat[index] = mat[index-d.stride-d.cols];
}
}

Author

@danpovey I think there may be a problem here:

LOG (nnet3-chain-train:UpdateParamsWithMaxChange():nnet-chain-training.cc:225) Per-component max-change active on 19 / 35 Updatable Components.(smallest factor=0.223659 on tdnn2.affine with max-change=0.75). Global max-change factor was 0.495221 with max-change=2.
ERROR (nnet3-chain-train:MulElements():cu-matrix.cc:665) cudaError_t 77 : "an illegal memory access was encountered" returned from 'cudaGetLastError()'

[ Stack-Trace: ]
nnet3-chain-train() [0xb78566]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::CuMatrixBase<float>::MulElements(kaldi::CuMatrixBase<float> const&)
kaldi::nnet3::DropoutComponent::Propagate(kaldi::nnet3::ComponentPrecomputedIndexes const*, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float>*) const
kaldi::nnet3::NnetComputer::ExecuteCommand(int)
kaldi::nnet3::NnetComputer::Forward()
kaldi::nnet3::NnetChainTrainer::Train(kaldi::nnet3::NnetChainExample const&)
main
__libc_start_main
nnet3-chain-train() [0x7d23c9]

Owner

This is probably because of not using ==.
I think your code will not work in any case. It may require some sync_threads after first setting the values when j == j_tempt
and then copy them for j != j_tempt.

static void _apply_heaviside_by_row(Real* mat, MatrixDim d) {
int i = blockIdx.x * blockDim.x + threadIdx.x; // col index
int j = blockIdx.y * blockDim.y + threadIdx.y; // row index
int j_tempt = blockIdx.y * blockDim.y + threadIdx.y; // row index using to control setting heavyside() in the first rows
Owner

Did you want to get the 0th row or something? You have given the same expression as j.

int j_tempt = blockIdx.y * blockDim.y + threadIdx.y; // row index using to control setting heavyside() in the first rows
int index = i + j * d.stride;
if (i < d.cols && j < d.rows)
if (j = j_tempt) {
Owner

==


@@ -771,6 +771,11 @@ def __init__(self):
lstm*=0,0.2,0'. More general should precede
less general patterns, as they are applied
sequentially.""")
self.parser.add_argument("--trainer.dropout-per-frame", type=str,
Owner

Is this option required? Do you expect to change whether dropout is per frame or not during the training iterations?
I think dropout-per-frame should only be at the config level.
Also I think you can remove dropout_per_frame from the function SetDropoutProportion, because that is something you would have already defined from the config. If you really need to change dropout-per-frame during training, I suggest adding a separate function like SetDropoutPerFrame to the DropoutComponent.
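
For illustration only, a sketch of the separate setter suggested here (SetDropoutPerFrame is the suggested name, not existing Kaldi API), kept next to the existing proportion setter:

  void SetDropoutProportion(BaseFloat dropout_proportion) {
    dropout_proportion_ = dropout_proportion;
  }
  // Only needed if per-frame behaviour really must change after initialization.
  void SetDropoutPerFrame(bool dropout_per_frame) {
    dropout_per_frame_ = dropout_per_frame;
  }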

@danpovey

danpovey commented Dec 17, 2016 via email

@@ -243,6 +244,7 @@ if [ $stage -le 16 ]; then
--egs.chunk-left-context $chunk_left_context \
--egs.chunk-right-context $chunk_right_context \
--trainer.dropout-schedule $dropout_schedule \
--trainer.dropout-per-frame $dropout_per_frame \


as Vimal says, please remove this from the training code... does not need to be there.

@@ -64,6 +64,7 @@ void cudaF_apply_pow(dim3 Gr, dim3 Bl, float* mat, float power, MatrixDim d);
void cudaF_apply_pow_abs(dim3 Gr, dim3 Bl, float* mat, float power,
bool include_sign, MatrixDim d);
void cudaF_apply_heaviside(dim3 Gr, dim3 Bl, float* mat, MatrixDim d);
void cudaF_apply_heaviside_by_row(dim3 Gr, dim3 Bl, float* mat, MatrixDim d);


there is no need for any of these changes in cudamatrix/... just use CopyColsFromVec.

@@ -34,7 +34,7 @@ NnetCombiner::NnetCombiner(const NnetCombineConfig &config,
nnet_params_(std::min(num_nnets, config_.max_effective_inputs),
NumParameters(first_nnet)),
tot_input_weighting_(nnet_params_.NumRows()) {
SetDropoutProportion(0, &nnet_);
SetDropoutProportion(0, false, &nnet_);


you can remove this 'false' argument from the function... just make it a fixed property of the component that can't be changed after you initialize.

@@ -87,27 +87,37 @@ void PnormComponent::Write(std::ostream &os, bool binary) const {
}


void DropoutComponent::Init(int32 dim, BaseFloat dropout_proportion) {
void DropoutComponent::Init(int32 dim, BaseFloat dropout_proportion, bool dropout_per_frame) {


please watch line length (80-char limit)

if (!ok || cfl->HasUnusedValues() || dim <= 0 ||
dropout_proportion < 0.0 || dropout_proportion > 1.0)
KALDI_ERR << "Invalid initializer for layer of type "
<< Type() << ": \"" << cfl->WholeLine() << "\"";
Init(dim, dropout_proportion);
if( ! ok2 )
{


you don't need a branch here because dropout_per_frame defaults to false if not set (that's how you
initialized the variable). Don't have the 'ok2' variable; you don't need to check the return status of
cfl->GetValue("dropout-per-frame", &dropout_per_frame);
because it is an optional parameter.
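
A rough sketch of how DropoutComponent::InitFromConfig() could treat dropout-per-frame as an optional value without the 'ok2' branch; this assumes the rest of the function stays as in the existing code:

void DropoutComponent::InitFromConfig(ConfigLine *cfl) {
  int32 dim = 0;
  BaseFloat dropout_proportion = 0.0;
  bool dropout_per_frame = false;  // default used when the option is absent.
  bool ok = cfl->GetValue("dim", &dim) &&
            cfl->GetValue("dropout-proportion", &dropout_proportion);
  // Optional parameter: no need to check the return value.
  cfl->GetValue("dropout-per-frame", &dropout_per_frame);
  if (!ok || cfl->HasUnusedValues() || dim <= 0 ||
      dropout_proportion < 0.0 || dropout_proportion > 1.0)
    KALDI_ERR << "Invalid initializer for layer of type "
              << Type() << ": \"" << cfl->WholeLine() << "\"";
  Init(dim, dropout_proportion, dropout_per_frame);
}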

{
// This const_cast is only safe assuming you don't attempt
// to use multi-threaded code with the GPU.
const_cast<CuRand<BaseFloat>&>(random_generator_).RandUniform(out);


Here, you'll want to create a temporary vector with dimension equal to the num-rows in your 'in'/'out' matrices, and do the rand stuff on that, then you'll need CopyColsFromVec().
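
As a rough sketch of that suggestion, inside DropoutComponent::Propagate() and using the 'dropout' variable already defined there; whether CuRand provides a RandUniform() overload for a CuVector is an assumption (if not, fill a one-column temporary CuMatrix and copy the column out first):

  if (dropout_per_frame_) {
    // One random value per frame (row), then broadcast it across the columns.
    CuVector<BaseFloat> random_drop(in.NumRows(), kUndefined);
    // This const_cast is only safe assuming you don't attempt
    // to use multi-threaded code with the GPU.
    const_cast<CuRand<BaseFloat>&>(random_generator_).RandUniform(&random_drop);
    random_drop.Add(-dropout);          // > 0.0 with probability (1 - dropout).
    out->CopyColsFromVec(random_drop);  // every column gets the per-row values.
    out->ApplyHeaviside();              // (x > 0 ? 1 : 0): a 0/1 mask, constant within each row.
    out->MulElements(in);
  } else {
    // ... existing element-wise dropout code ...
  }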


@@ -87,11 +87,11 @@ class PnormComponent: public Component {
// "Dropout: A Simple Way to Prevent Neural Networks from Overfitting".
class DropoutComponent : public RandomComponent {
public:
void Init(int32 dim, BaseFloat dropout_proportion = 0.0);
void Init(int32 dim, BaseFloat dropout_proportion = 0.0, bool dropout_per_frame = false);


please watch line length.


line too long

void SetDropoutProportion(BaseFloat dropout_proportion, bool dropout_per_frame) {
dropout_proportion_ = dropout_proportion;
dropout_per_frame_ = dropout_per_frame;
}


Odd indentation. Make sure you are not introducing tabs into the file.
If you use emacs, use

(setq-default tab-width 4)
(setq-default fill-column 80)
(setq-default indent-tabs-mode `nil)
(add-hook 'write-file-hooks 'delete-trailing-whitespace)


(load-file "~/.google-c-style.el")
(add-hook 'c-mode-common-hook 'google-set-c-style)
(add-hook 'c-mode-common-hook 'google-make-newline-indent)

You can find google-c-style.el from a web search.

Author

@danpovey I am updating to the new version; sorry for the incorrect formatting. I'm using Visual Studio Code/Sublime, and so far I haven't found a formatting tool that keeps exactly the style we use in Kaldi (I tried formatting nnet-simple-component.cc and it changed a lot of the existing code, even though it is also Google style). I have pushed to the PR the nnet-simple-component.cc formatted by a tool under Sublime.

virtual std::string Info() const;

void SetDropoutProportion(BaseFloat dropout_proportion) { dropout_proportion_ = dropout_proportion; }
void SetDropoutProportion(BaseFloat dropout_proportion, bool dropout_per_frame) {


.. remove dropout_per_frame..

#Final valid prob -0.245208 -0.246
#Final train prob (xent) -1.47648 -1.54
#Final valid prob (xent) -2.16365 -2.10
#System tdnn_lstm1i_sp_bi_ihmali_ld5 tdnn_lstm1i_dp_sp_bi_ihmali_ld5


when you do a new experiment you should create a different letter/number combination, e.g. 1j, and use the 'compare_wer_general.sh' script or whatever it's called to compare with the baseline, if possible. please stay within the existing conventions for script naming.


... also, if the per-frame dropout turns out, in the end, not to be that useful, we might not want to check it into Kaldi. But let's see how your experiments turn out.

Author

@danpovey it would be better if you could have a look at whether my nnet-simple-component.cc in the PR has the right format.
