CudnnLSTM dropout takes no effect #6466

Closed
robotnc opened this issue Dec 23, 2016 · 13 comments
robotnc commented Dec 23, 2016

My environment is: TensorFlow 0.11.0rc2, on Ubuntu 16.04, with CUDA 8.0 and cuDNN 5.1; the GPU is a GTX 1080.

I am using CudnnLSTM from the tensorflow.contrib.cudnn_rnn package. I found that the dropout setting in CudnnLSTM seems to take no effect, and I checked that there is no test for dropout in the op unit tests. So I wrote a small script to test it, shown below:

import tensorflow as tf
from tensorflow.contrib.cudnn_rnn import CudnnLSTM

class Cudnn_model():
  def __init__(self, dropout):
    # Single-layer cuDNN LSTM; "skip_input" feeds the input straight into
    # the cell, so input_size must equal num_units.
    self.model = CudnnLSTM(
        num_layers=1,
        num_units=8,
        input_size=8,
        input_mode="skip_input",
        direction="unidirectional",
        dropout=dropout,
        )

    # cuDNN keeps all weights and biases in one flat parameter buffer.
    params_size_t = self.model.params_size()
    self.params = tf.Variable(tf.ones([params_size_t]), validate_shape=False)

  def run_step(self, rnn_inputs):
    outputs, output_h, output_c = self.model(
        input_data=rnn_inputs,
        input_h=tf.zeros([1, 1, 8]),
        input_c=tf.zeros([1, 1, 8]),
        params=self.params,
        is_training=True)
    self.outputs = outputs
    return outputs

def main():
  # One time step, batch size 1, 8 features.
  inputs = tf.pack([[[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]])
  m1 = Cudnn_model(dropout=0.0)
  output1 = m1.run_step(inputs)
  m2 = Cudnn_model(dropout=0.5)
  output2 = m2.run_step(inputs)
  output3 = tf.nn.dropout(output1, 0.5)
  output4 = m1.run_step(tf.nn.dropout(inputs, 0.5))

  config = tf.ConfigProto(allow_soft_placement=True)
  config.gpu_options.allow_growth = True
  sess = tf.Session(config=config)
  sess.run(tf.initialize_all_variables())

  for i in range(5):
    out1, out2, out3, out4 = sess.run([output1, output2, output3, output4])
    print " ----- Try time %d -----" % i
    print "cndnn_dropout=0 : ", out1
    print "cudnn_dropout=0.5 : ", out2
    print "tf_out_dropout=0.5 : ", out3
    print "tf_in_dropout=0.5 : ", out4

if __name__ == "__main__":
  main()

And the result is:

----- Try time 0 -----
cndnn_dropout=0 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
cudnn_dropout=0.5 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
tf_out_dropout=0.5 [[[ 1.2165668   0.          0.          0.          0.          1.52103424
1.52239561  1.52289677]]]
tf_in_dropout=0.5 [[[ 0.6082834   0.6082834   0.6082834   0.76119781  0.6082834   0.76158684
0.6082834   0.6082834 ]]]

 ----- Try time 1 -----
cndnn_dropout=0 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
cudnn_dropout=0.5 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
tf_out_dropout=0.5 [[[ 0.  0.  0.  0.  0.  0.  0.  0.]]]
tf_in_dropout=0.5 [[[ 0.6082834   0.74009657  0.6082834   0.76119781  0.6082834   0.76158684
0.6082834   0.6082834 ]]]

 ----- Try time 2 -----
cndnn_dropout=0 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
cudnn_dropout=0.5 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
tf_out_dropout=0.5 [[[ 1.2165668   0.          0.          1.50730526  1.51733613  1.52103424
1.52239561  0.        ]]]
tf_in_dropout=0.5 [[[ 0.6082834   0.74009657  0.6082834   0.76119781  0.6082834   0.76158684
0.6082834   0.761594  ]]]

 ----- Try time 3 -----
cndnn_dropout=0 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
cudnn_dropout=0.5 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
tf_out_dropout=0.5 [[[ 0.          0.          1.48019314  0.          0.          0.
1.52239561  1.52289677]]]
tf_in_dropout=0.5 [[[ 0.6082834   0.6082834   0.6082834   0.76119781  0.76154053  0.6082834
0.76159316  0.761594  ]]]

 ----- Try time 4 -----
cndnn_dropout=0 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
cudnn_dropout=0.5 [[[ 0.6082834   0.70377535  0.74009657  0.75365263  0.75866807  0.76051712
0.76119781  0.76144838]]]
tf_out_dropout=0.5 [[[ 0.          1.40755069  0.          1.50730526  1.51733613  0.
1.52239561  1.52289677]]]
tf_in_dropout=0.5 [[[ 0.6082834   0.74009657  0.75866807  0.76119781  0.6082834   0.76158684
0.76159316  0.6082834 ]]]

From the results I see that cudnn_dropout = 0.5 takes no effect; the output is always the same as with cudnn_dropout = 0.0.

robotnc changed the title from "CudnnLSTM dropout take no effect" to "CudnnLSTM dropout takes no effect" Dec 23, 2016
michaelisard added the type:bug label Jan 5, 2017
michaelisard assigned zhangyaobit and unassigned zheng-xq Jan 5, 2017
@zhangyaobit

@robotnc Sorry for my late response; I was on vacation. Yes, dropout is not supported yet, see here.

Adding @zheng-xq

@alquraishi

If we do get dropout support, can we also get recurrent dropout, since that's the form that actually seems to help? Similar to what's in LayerNormBasicLSTMCell.
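
For reference, here is a rough sketch of what recurrent dropout looks like with the existing LayerNormBasicLSTMCell; the dropout_keep_prob argument applies dropout inside the recurrent update rather than to the layer inputs or outputs. This is only an illustration against the contrib API, and the exact signature may differ between TF versions:

import tensorflow as tf

# Sketch: LayerNormBasicLSTMCell applies dropout to the candidate
# cell-state update (recurrent dropout), controlled by dropout_keep_prob.
cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
    num_units=8,
    layer_norm=True,
    dropout_keep_prob=0.5)

# Dummy batch of one sequence with 4 time steps and 8 features.
inputs = tf.ones([1, 4, 8])
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)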

zhangyaobit added the stat:contribution welcome label Feb 16, 2017
zhangyaobit removed their assignment Feb 27, 2017
@fxsuper

fxsuper commented Apr 21, 2017

Any updates on this? Not having any dropout support for cuDNN-based RNNs seems really limiting, especially considering how much further ahead PyTorch's support for this is. This isn't exactly a new or rarely used feature.

@alquraishi

What's actually involved in adding (input) dropout support? Is it just a matter of wiring up the places in this file that are marked with /*dropout*/?

vrv removed the stat:contribution welcome label Apr 22, 2017
zheng-xq assigned protoget and unassigned zheng-xq Apr 23, 2017
@robotnc
Author

robotnc commented Apr 23, 2017 via email

robotnc closed this as completed Apr 23, 2017
@alquraishi

alquraishi commented Apr 23, 2017

My experience is very different. I still see a 3x gap between FusedBlockLSTM and CudnnLSTM, so I would still think this is very useful to have. And the issue is that the feature is present as an option but doesn't actually do anything, which is very misleading as it stands.

@robotnc
Author

robotnc commented Apr 23, 2017 via email

@alquraishi

alquraishi commented Apr 23, 2017

TF 1.1rc2. And as I mentioned, it's not just a performance difference but a bona fide bug, because the CudnnLSTM API exposes a dropout option that does not do anything. If you don't mind, please reopen the ticket; otherwise I'll start a new one.

FYI this is on a Pascal Titan X with a bidirectional LSTM of 800 units (each way) and 700 timesteps.

@robotnc
Author

robotnc commented Apr 23, 2017

Dropout is still needed; reopening.

robotnc reopened this Apr 23, 2017
@protoget
Member

protoget commented May 6, 2017

The CudnnRNN dropout change has been submitted; please keep an eye on the nightly builds.

@skye
Member

skye commented Jun 16, 2017

@protoget has this been resolved?

@alquraishi

@protoget I see your commit from last May, but I just tried again with TF 1.4.0rc0 and dropout still doesn't seem to do anything.

@protoget
Member

@skye @alquraishi
Dropout is supported. It is applied between layers, so if you only have one layer it has no effect even when the dropout ratio is nonzero.
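
To illustrate, here is a minimal sketch (assuming the later layer-style tf.contrib.cudnn_rnn.CudnnLSTM API; argument and method names may vary between releases): with num_layers of 2 or more and training enabled, repeated evaluations of the same input should differ because dropout is applied between the stacked layers, whereas a single-layer model stays deterministic.

import tensorflow as tf

# Sketch only: dropout is applied between stacked layers, so it needs
# num_layers >= 2 to have a visible effect.
lstm = tf.contrib.cudnn_rnn.CudnnLSTM(num_layers=2, num_units=8, dropout=0.5)

inputs = tf.ones([1, 1, 8])  # time-major: [time, batch, features]
out_train, _ = lstm(inputs, training=True)    # dropout active
out_infer, _ = lstm(inputs, training=False)   # dropout disabled

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(out_train))  # should vary from run to run
    print(sess.run(out_train))
    print(sess.run(out_infer))  # deterministic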
