Bi-Directional RNN support #66

Closed
wants to merge 15 commits into from

Conversation

SeanNaren

Re-opened request on a new branch.

Contains an implementation of BRNN for the LSTM module based on the BiSequencer module here.

Let me know of any feedback/alterations!

@SeanNaren
Author

@elbamos I've opened a PR here, but there is a merge with RNN happening here. It would be nice to get BRNN support for the LSTM in RNN; hopefully @jcjohnson agrees!

@elbamos

elbamos commented Apr 14, 2016

Cool! I just hope it doesn't make rnn even more complicated.


-- reverse output
local k = 1
for i = input:size(1), 1, -1 do
  self.output[k] = input[i]
  k = k + 1
end
jcjohnson (Owner)

I'm worried that an explicit loop for reversing will be very slow for CudaTensors, since each execution of this line will require device synchronization and a kernel launch. Can you reverse the sequence in a vectorized way to avoid this overhead?
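For illustration only, one vectorized approach is to precompute a reversal index and reverse the whole sequence with a single index (indexSelect) call. This is a sketch rather than the PR's code: it assumes the input is laid out as T x N x D with time on dimension 1, revIndices is a placeholder name, and index on CudaTensors may expect the index tensor in a device-specific type.

-- Sketch: reverse along the time dimension with one indexSelect call,
-- so a CudaTensor pays for a single kernel launch instead of one per timestep.
local T = input:size(1)
local revIndices = torch.range(T, 1, -1):long()  -- {T, T-1, ..., 1}
self.output = input:index(1, revIndices)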

@jcjohnson
Owner

@SeanNaren Sorry I've been slow to respond to this - I agree that bidirectional RNNs are a good thing that we should support.

Overall the BRNN and ReverseSequence API looks good to me, but I have some inline comments.

@SeanNaren
Author

No worries, thanks for the comments; I'll work through them, they all seem like very valid points! I agree about the loop, I wasn't really a fan of it, so I'll try to find a better solution. Any suggestions would be awesome!

@SeanNaren
Author

@jcjohnson Added a different version of reversing the sequence using this response. How does it look now?

For clearState, it seems to be handled in nn.Container, and I add the modules to the container, so hopefully that will take care of it. Let me know of any more feedback or if I missed anything!
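For context, a narrow-based reverse presumably looks roughly like the sketch below (an illustration only, not the PR's exact code); the per-timestep copy pattern is what the next comment refers to.

-- Sketch: narrow out one timestep at a time and copy it into the mirrored slot.
-- Each copy still launches a separate CUDA kernel on CudaTensors.
local T = input:size(1)
self.output:resizeAs(input)
for t = 1, T do
  self.output:narrow(1, t, 1):copy(input:narrow(1, T - t + 1, 1))
end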

@jcjohnson
Owner

@SeanNaren The solution using narrow is pretty much the same as before, just using a different syntax; each call to copy will still invoke a separate CUDA kernel. I haven't thought through the details, but I think you should be able to reverse the sequence all at once using a single call to gather, which will probably be much faster.

Other than that it's looking good. Can you add some unit tests to make sure everything works as expected?
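For reference, a gather-based reverse along the lines of the suggestion above could look roughly like the sketch below (an illustration only, assuming a T x N x D input; revIndices is a placeholder name). The index tensor must have the same shape as the output, with entry [t][n][d] naming the source timestep to read; on the CPU it is a LongTensor, while the GPU backend may expect the index in a device-resident type.

-- Sketch: build the reversal index once, then do the whole copy in one gather call.
local T, N, D = input:size(1), input:size(2), input:size(3)
local revIndices = torch.range(T, 1, -1):long()  -- {T, T-1, ..., 1}
revIndices = revIndices:view(T, 1, 1):expand(T, N, D):contiguous()
self.output = input:gather(1, revIndices)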

@SeanNaren
Author

@jcjohnson Thanks, your suggestion helped a lot! I've pushed a change that uses gather; let me know of any feedback you may have on it. I'll add some unit tests now.

@SeanNaren
Author

Added tests, let me know if there are any more issues/feedback.

-- build the reversal index, then reverse gradOutput with a single gather
for x = 1, T do
  indices:narrow(1, x, 1):fill(T - x + 1)
end
self.gradInput = gradOutput:gather(1, indices)

This allocates new memory at every function call. Could you do something like

self.gradInput:gather(gradOutput, 1, indices)

instead?
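In other words, the suggestion is to keep self.gradInput as a reusable buffer and use the destination form of gather. A rough sketch under that assumption, where indices is the reversal index built just above:

-- Sketch: write into the existing self.gradInput storage instead of
-- allocating a fresh tensor on every backward call.
self.gradInput = self.gradInput or gradOutput.new()
self.gradInput:resizeAs(gradOutput)
self.gradInput:gather(gradOutput, 1, indices)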

@fmassa

fmassa commented Apr 15, 2016

I have some comments on the code (mostly minor), but I commented on two things that I think deserve some attention. Overall I think this is nice :)

@SeanNaren
Author

@fmassa Thanks a lot man! Lots of good fixes, cheers. I think I've covered all the issues; let me know if you have any more feedback.

@yfliao

yfliao commented Jun 17, 2016

Dear SeanNaren,

How do I apply the BRNN? The following command fails:

$ th train.lua -input_h5 my_data.h5 -input_json my_data.json -model_type brnn
Running with CUDA on GPU 0
/home/liao/torch/install/bin/luajit: ./LanguageModel.lua:46: attempt to index local 'rnn' (a nil value)
stack traceback:
./LanguageModel.lua:46: in function '__init'
/home/liao/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/liao/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'LanguageModel'
train.lua:90: in main chunk
[C]: in function 'dofile'
...liao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405d50

@ostrosablin

How do I integrate it into train.lua? The new code doesn't seem to be plugged into the existing torch-rnn facilities.

Also, how do they behave in practice? According to papers, they're good at certain kinds of tasks and achieve better loss than other types of RNN. But how do they compare with, say, an LSTM on character-level language modelling? Do they produce better or more interesting results?

@SeanNaren SeanNaren closed this Jun 1, 2017
This pull request was closed.