
Added BRNN support #3

Closed
wants to merge 49 commits into from

Conversation

SeanNaren

Hey, I've added BLSTM support; hopefully it looks good! I had to expose some methods in RNN so they could be inherited. Hopefully it's useful; let me know if you have any feedback.

@borisfom
Owner

Looks good. Can you add some tests, too?

@SeanNaren
Author

Good idea, but I'm not sure how to test the RNN script. Is there a way we could get the biases and weights of the gates? How would you suggest testing the two modules?

@ngimel
Collaborator

ngimel commented Apr 12, 2016

cuDNN provides the cudnnGetRNNLinLayerMatrixParams and cudnnGetRNNLinLayerBiasParams functions for getting pointers to the weights and biases.

@SeanNaren
Author

I'm trying to replicate the sample to act as the test, but I can't figure out how to create the double pointer required for the last argument of cudnnGetRNNLinLayerMatrixParams:

float *linLayerMat;

cudnnErrCheck(cudnnGetRNNLinLayerMatrixParams(cudnnHandle,
                                              rnnDesc,
                                              layer,
                                              xDesc,
                                              wDesc,
                                              w,
                                              linLayerID,
                                              linLayerMatDesc,
                                              (void**)&linLayerMat));

How would I create the double pointer? My attempt so far, which may also help:

local ptr = ffi.new("float*")

local outputDesc = rnn:createFilterDescriptors(1)
errcheck('cudnnGetRNNLinLayerMatrixParams',
    cudnn.getHandle(),
    rnn.rnnDesc[0],
    0,
    rnn.xDescs,
    rnn.wDesc[0],
    rnn.weight:data(),
    0,
    outputDesc[0],
    ptr) -- How to cast float* to void**?
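
The usual LuaJIT FFI idiom for a void** out-parameter, for reference, is to allocate a one-element array of pointers and cast it, which is the approach the later comments in this thread adopt:

local matrixPointer = ffi.new("float*[1]")   -- an array of one float*
-- pass ffi.cast("void**", matrixPointer) as the last argument;
-- afterwards matrixPointer[0] holds the float* written by cuDNN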

@ngimel
Collaborator

ngimel commented Apr 12, 2016

@SeanNaren
Author

Thanks! I managed to cast it using ffi to get the pointer, as shown below, but I'm not sure how to access the actual data from the pointer. In the C RNN example, all the values in the matrix/bias are initialised to a set value; would I be able to replicate that somehow in Lua?

local linLayerMatDesc = rnn:createFilterDescriptors(1)
local matrixPointer = ffi.new("float*[1]")
errcheck('cudnnGetRNNLinLayerMatrixParams',
    cudnn.getHandle(),
    rnn.rnnDesc[0],
    layer,
    rnn.xDescs,
    rnn.wDesc[0],
    rnn.weight:data(),
    layerId,
    linLayerMatDesc[0],
    ffi.cast("void**", matrixPointer)) -- How to access data stored at this pointer?

@ngimel
Collaborator

ngimel commented Apr 12, 2016

You'll have to get the weight size from linLayerMatDesc[0], calculate the offset as
offset = matrixPointer[0] - rnn.weight:data(), and then do something like

wTensor = torch.Tensor(rnn.weight:storage(), offset, size)
wTensor:fill(1)
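
Putting those pieces together, here is a sketch of a helper that returns a given weight (or bias) block as a Torch tensor so it can be filled like in the C sample. Assumptions: errcheck is the cudnn.torch error-checking helper used in the snippets above, the cuDNN FFI declarations are loaded (so cudnnDataType_t/cudnnTensorFormat_t are known to ffi), rnn.weight is the flat CudaTensor holding all parameters, pointer subtraction on float* yields an element offset, and Torch storage offsets are 1-based (hence the +1).

local ffi = require 'ffi'

local function getRNNParam(rnn, layer, layerId, fname)
    -- fname is 'cudnnGetRNNLinLayerMatrixParams' or 'cudnnGetRNNLinLayerBiasParams'
    local desc = rnn:createFilterDescriptors(1)
    local paramPointer = ffi.new("float*[1]")
    errcheck(fname,
        cudnn.getHandle(),
        rnn.rnnDesc[0],
        layer,
        rnn.xDescs,
        rnn.wDesc[0],
        rnn.weight:data(),
        layerId,
        desc[0],
        ffi.cast("void**", paramPointer))

    -- Recover the size of this parameter block from its filter descriptor.
    local dataType = ffi.new("cudnnDataType_t[1]")
    local format = ffi.new("cudnnTensorFormat_t[1]")
    local nbDims = torch.IntTensor(1)
    local minDim = 3
    local filterDimA = torch.IntTensor(minDim):fill(1)
    errcheck('cudnnGetFilterNdDescriptor',
        desc[0],
        minDim,
        dataType,
        format,
        nbDims:data(),
        filterDimA:data())

    -- Pointer difference is an element offset into the flat weight buffer.
    local offset = tonumber(paramPointer[0] - rnn.weight:data())
    return torch.CudaTensor(rnn.weight:storage(), offset + 1, filterDimA:prod())
end

-- e.g. initialise the first weight matrix of layer 0 to 1, as the C sample does:
getRNNParam(rnn, 0, 0, 'cudnnGetRNNLinLayerMatrixParams'):fill(1)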

@SeanNaren
Author

Awesome! And for the bias values, would the same approach work? The call is nearly identical:

local linLayerBiasDesc = rnn:createFilterDescriptors(1)
local biasPointer = ffi.new("float*[1]")
errcheck('cudnnGetRNNLinLayerBiasParams',
    cudnn.getHandle(),
    rnn.rnnDesc[0],
    layer,
    rnn.xDescs,
    rnn.wDesc[0],
    rnn.weight:data(),
    layerId,
    linLayerBiasDesc[0],
    ffi.cast("void**", biasPointer))

@ngimel
Collaborator

ngimel commented Apr 13, 2016

Yes, this should work for biases too.
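
With the helper sketched above (same assumptions), the bias block can be fetched the same way; the bias pointer also points into the flat rnn.weight buffer, so the same offset arithmetic applies:

getRNNParam(rnn, 0, 0, 'cudnnGetRNNLinLayerBiasParams'):fill(1)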

@SeanNaren
Author

Added a test for RNN that uses the checksums taken from the cuDNN RNN C sample. The checksums for a few of the backward passes are not correct; any help would be great! I'll keep trying to fix them as well.

@SeanNaren
Author

The test now works for ReLU/Tanh/LSTM/GRU. One test is still failing (the weight checksum for ReLU is slightly off; I'm not entirely sure why).

@ngimel
Collaborator

ngimel commented Apr 13, 2016

Great, thanks!

@SeanNaren
Author

Just as a side note, would it make more sense to add the LSTM/GRU RNNs as separate modules, like BLSTM?

local dataType = 'CUDNN_DATA_FLOAT'
local format = 'CUDNN_TENSOR_NCHW'
local nbDims = torch.IntTensor(1)
local filterDimA = torch.IntTensor({ 1, 1, 1 })
Collaborator

You can remove the hard-coded 3 dimensions with

local minDim = 3
local filterDimA = torch.ones(minDim)

then replace the '3' argument to the GetFilterNdDescriptor call with minDim, and replace filterDimA[1] * filterDimA[2] * filterDimA[3] with filterDimA:prod().

@SeanNaren
Author

SeanNaren commented Apr 15, 2016

Added batchFirst to the RNN, let me know what you guys think!

EDIT: should we support RNN ReLU/Tanh as separate modules? The only thing I'm concerned about is that after setting self.mode to either, you have to reset the RNN descriptor (a call to reset()). Does this warrant a separate module?
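
For what it's worth, such a module could be a very thin wrapper in the same spirit as BLSTM: set self.mode and rebuild the descriptor in __init. A minimal sketch (the class name is illustrative, and the constructor arguments mirror the cudnn.RNN signature assumed elsewhere in this PR):

local RNNTanh, parent = torch.class('cudnn.RNNTanh', 'cudnn.RNN')

function RNNTanh:__init(inputSize, hiddenSize, numLayers, batchFirst)
    parent.__init(self, inputSize, hiddenSize, numLayers, batchFirst)
    self.mode = 'CUDNN_RNN_TANH'
    self:reset()   -- rebuild the RNN descriptor with the new mode
end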

@ngimel
Collaborator

ngimel commented Apr 16, 2016

@SeanNaren, Tanh, ReLU and Sigmoid activations all have their own separate modules, so I don't see why Tanh or ReLU RNNs shouldn't. Great work and thanks for your help!
@elbamos, @SeanNaren, do you guys think we should handle saving hidden outputs / using them as hidden inputs more consistently than is done now in the base RNN class? Say, have functions for copying hy to hx and for forgetting hx? Have an option to not output hy (cuDNN allows this)? That would provide some (small) memory savings if you don't have to unroll beyond a single call to cuDNN.
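
For illustration only, helpers along those lines might look something like this (a sketch, assuming the module stores its state in hiddenInput/hiddenOutput and cellInput/cellOutput fields; names and placement are not final):

local function forgetState(rnn)   -- drop hx so the next call starts fresh
    rnn.hiddenInput, rnn.cellInput = nil, nil
end

local function carryState(rnn)    -- copy hy -> hx for the next forward call
    rnn.hiddenInput = rnn.hiddenOutput and rnn.hiddenOutput:clone() or nil
    rnn.cellInput = rnn.cellOutput and rnn.cellOutput:clone() or nil
end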

@elbamos

elbamos commented Apr 16, 2016

@ngimel Maybe other people would, but I do not need the ability to drill down into hx and hy. I would, though, benefit if we had a choice between keeping all hidden outputs versus only the last output of the sequence, but not in a B-RNN. Is that what you meant?

@elbamos

elbamos commented Apr 16, 2016

I'm seeing this with the current version when I try to create a new BLSTM layer. The BLSTM layer I created last night continues to function and train.

th> cnn:add(cudnn.BLSTM(7040, 7040 * 16, 2, true))
cuda runtime error (77) : an illegal memory access was encountered at /tmp/luarocks_cutorch-scm-1-3384/cutorch/lib/THC/generic/THCStorage.c:147
stack traceback:
        [C]: at 0x7f5cfb1f3210
        [C]: at 0x7f5cfb1f8170
        [C]: in function 'CudaTensor'
        /usr/local/share/lua/5.1/cudnn/RNN.lua:114: in function 'resetDropoutDescriptor'
        /usr/local/share/lua/5.1/cudnn/RNN.lua:47: in function 'reset'
        /usr/local/share/lua/5.1/cudnn/RNN.lua:33: in function '__init'
        /usr/local/share/lua/5.1/cudnn/BLSTM.lua:4: in function '__init'
        /usr/local/share/lua/5.1/torch/init.lua:91: in function </usr/local/share/lua/5.1/torch/init.lua:87>
        [C]: in function 'BLSTM'
        [string "_RESULT={cnn:add(cudnn.BLSTM(7040, 7040 * 16,..."]:1: in main chunk
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/trepl/init.lua:651: in function 'repl'
        /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x00405ea0

@SeanNaren
Author

SeanNaren commented Apr 16, 2016

@elbamos that's a big hidden dim you have there; I don't think I can even fit that into memory to test! It works alright with a smaller input dim on my end, however.

Added separate modules for Tanh and for ReLU. @ngimel Sounds great, but I'm sure you're much better informed than me, so any guidance on how to approach this would be great :)

EDIT: should've done this some time ago, but I've added you as a collaborator @ngimel; feel free to change anything you see fit!

EDIT2: Don't want to spam too much, but the tests are now much faster. I should've realised earlier that it was the manual for loops that were taking up all the time...

@SeanNaren
Author

SeanNaren commented Apr 17, 2016

@borisfom I think the support is mostly done and the tests are looking good. How would you suggest we merge the changes here and RNN into main? Shall I rebase onto R5 in a new branch and open a new pull request from that branch into R5?

@borisfom
Owner

Please hold on: I am rebasing my R5 onto upstream R5 (a nontrivial exercise due to the selective merge they did earlier). This will likely result in a new branch; then I will ask you to do a PR against that branch.

@ngimel
Collaborator

ngimel commented Apr 18, 2016

@SeanNaren Thanks! Looks like neither you nor @elbamos need more flexibility for hidden layers, so we can leave that for a later date. You guys serve as proxies for real RNN users ;-)
@elbamos - currently you cannot keep all your hidden outputs if you are processing your whole sequence in a single call to cuDNN. If you really need that, you have to construct your network as a series of calls to cuDNN, one for every time step, pretty much like it's done in other Torch RNN packages. Then you'll have all your hidden outputs (but you'll also lose the performance optimizations beyond "Optimizing a Single Iteration" listed in the blog post; "Pre-Transposing the Weight Matrix" could possibly be used even in this case, but it would require changing APIs, so not in this cuDNN version :-). Let me know if you think this is an important case; maybe we should have an API call for changing the weight matrix layout.
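
For reference, an illustrative sketch of that per-time-step alternative (assuming the module carries state via hiddenInput/hiddenOutput and cellInput/cellOutput fields, as discussed earlier in this thread; inputSize, hiddenSize, seqLength and input are placeholders):

local step = cudnn.RNN(inputSize, hiddenSize, 1)
step.mode = 'CUDNN_LSTM'
step:reset()

local allHidden = {}
for t = 1, seqLength do
    step:forward(input:narrow(1, t, 1))            -- one time step per cuDNN call
    allHidden[t] = step.hiddenOutput:clone()       -- keep every hidden output
    step.hiddenInput = step.hiddenOutput:clone()   -- feed hy back in as hx
    step.cellInput = step.cellOutput:clone()
end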

@borisfom
Owner

@SeanNaren: actually, please never mind, I will merge the PR before the rebase.
I will have to do some squashing (rebase -i HEAD~n) before merging it back into soumith's branch (ideally down to 2 or 3 commits); please let me know if you would rather do it yourself on your branch now.

@SeanNaren
Author

@ngimel, I think I see what you mean now. If it's a feature people want once the PR is merged, I'll be more than happy to take a look :)
@borisfom I'll help you out on that part; give me some time and I'll rebase to a new branch ASAP.

@borisfom
Owner

@SeanNaren: yes, the best course would be for you to create a new branch from the current one, squash it down to very few commits, and then send a new PR from that branch (unless you can change the branch in the PR directly).

@SeanNaren
Author

SeanNaren commented Apr 18, 2016

@borisfom, right here is the rebased branch; hopefully it's alright as just one commit. Let me know what you think!

If it's good, let me know which branch you want the PR against.

@borisfom
Owner

@SeanNaren: yes, one commit is perfect! Please do a PR against branch R5-re.

SeanNaren closed this Apr 23, 2016
@SeanNaren
Author

@borisfom @ngimel I thought this might be a suitable place to ask: how difficult would it be to support summation of the output layers rather than concatenation for BRNNs?

@ngimel
Collaborator

ngimel commented Apr 26, 2016

You can always add your outputs outside of cuDNN. We went with concatenation rather than addition because it preserves the information and leaves users free to postprocess the output the way they see fit. There is no overlap between layers in a bidirectional network, so if you split the layers and add the outputs yourself, you should not lose much performance.
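
For example, a minimal sketch of summing the two directions outside cuDNN, assuming the bidirectional output is laid out as seqLength x batch x (2 * hiddenSize) with the forward and backward halves concatenated along the last dimension:

local fwd = output:narrow(3, 1, hiddenSize)              -- forward direction
local bwd = output:narrow(3, hiddenSize + 1, hiddenSize) -- backward direction
local summed = fwd + bwd                                 -- seqLength x batch x hiddenSize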

@SeanNaren
Author

Awesome, thanks!

borisfom added a commit that referenced this pull request Aug 9, 2017
Spatial/Volumetric dilated convolution added with tests.
This pull request was closed.