
Add Bidirectional RNN #1626

Merged: 6 commits merged into mlpack:master from saksham189:br on Feb 20, 2019

Conversation

@saksham189 (Contributor) commented Jan 1, 2019:

TODOs

  • Add test

Refer to #1580.

@saksham189 (Contributor, author) commented Jan 1, 2019:

I am not too sure about the merge operation of the two RNNs. Currently I concatenate the outputs and pass that through a LogSoftmax layer, but I think we might need a fully connected layer before that.
Should we use the FFN class to perform the merge operation? @zoq

@saksham189 saksham189 force-pushed the saksham189:br branch from 13b194a to 36317cc Jan 4, 2019

@saksham189 (Contributor, author) commented Jan 4, 2019:

I can't find any tests specific to bidirectional RNNs. I have used one of the existing RNN tests to test my implementation.

@zoq (Member) commented Jan 4, 2019:

If I remember right, Alex Graves used the Reber grammar test (already implemented) for the BLSTM as well.

@zoq (Member) commented Jan 4, 2019:

About the merge operation, we could directly merge the results inside the BRNN class; let me know what you think.

> Currently I concatenate the outputs and pass that through a LogSoftmax layer.

Shouldn't we take the mean over the two?

@saksham189 (Contributor, author) commented Jan 5, 2019:

> Shouldn't we take the mean over the two?

There are multiple ways in which the outputs can be combined, i.e. summing, averaging, and concatenating. Concatenation is used as the default in most places. Then we might use a softmax layer or other hidden layers to produce the final output.
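To make the options concrete, here is a minimal Armadillo sketch of the three merge strategies; the matrix names and sizes are illustrative only and are not taken from the PR.

#include <armadillo>

int main()
{
  // Hypothetical per-time-step outputs of the forward and backward RNNs
  // (rows = features, columns = data points); names are illustrative only.
  arma::mat forwardOutput(4, 10, arma::fill::randu);
  arma::mat backwardOutput(4, 10, arma::fill::randu);

  // Summing the two outputs.
  arma::mat summed = forwardOutput + backwardOutput;

  // Averaging the two outputs.
  arma::mat averaged = (forwardOutput + backwardOutput) / 2.0;

  // Concatenating along the feature dimension, which doubles the output
  // size; a softmax (or other) layer can then map this to the final output.
  arma::mat concatenated = arma::join_cols(forwardOutput, backwardOutput);

  return 0;
}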

@zoq (Member) commented Jan 5, 2019:

Do you think we could introduce another template parameter, something like: typename MergeLayerType = Concat<>()?

@saksham189 (Contributor, author) commented Jan 6, 2019:

I would like to do that, but the Concat layer does a horizontal concatenation of the outputs from the other layers, as opposed to a vertical one.
https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/layer/concat_impl.hpp#L64

Should I add a parameter to the layer or modify the current implementation? @zoq
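To illustrate the difference, a small Armadillo sketch of the two concatenation directions (the variable names are illustrative only):

#include <armadillo>

int main()
{
  // Two hypothetical layer outputs with 3 features and 5 data points each.
  arma::mat a(3, 5, arma::fill::ones);
  arma::mat b(3, 5, arma::fill::zeros);

  // Horizontal concatenation: appends columns, giving a 3 x 10 matrix.
  arma::mat horizontal = arma::join_rows(a, b);

  // Vertical concatenation: stacks rows, giving a 6 x 5 matrix, i.e. the
  // feature dimension grows while the number of data points stays fixed.
  arma::mat vertical = arma::join_cols(a, b);

  return 0;
}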

@saksham189 (Contributor, author) commented Jan 8, 2019:

Also, I think we would need to add a run parameter to Concat, as in AddMerge.
I think in the forward pass of Concat we want run to be set to false, since the input to each layer inside Concat is different, and set to true when calling Backward. So we should create a visitor to set the value of the run parameter.
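A minimal sketch of what such a visitor could look like, in the spirit of mlpack's other visitors; this is an illustration only (the real implementation would also need to handle layers that do not expose a Run() accessor, e.g. via SFINAE checks):

#include <boost/variant/static_visitor.hpp>

// Sets the run parameter of a layer such as Concat or AddMerge.
class RunSetVisitor : public boost::static_visitor<void>
{
 public:
  explicit RunSetVisitor(const bool run = true) : run(run) { }

  template<typename LayerType>
  void operator()(LayerType* layer) const
  {
    // Forward the flag to the layer; this assumes the layer exposes a
    // Run() accessor returning a modifiable reference.
    layer->Run() = run;
  }

 private:
  //! The value to assign to the layer's run parameter.
  const bool run;
};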

@zoq (Member) commented Jan 10, 2019:

> Should I add a parameter to the layer or modify the current implementation? @zoq

By "modify the current implementation" you mean the Concat layer? That might be the best option here.

> Also, I think we would need to add a run parameter to Concat, as in AddMerge.

Agreed; would you like to do this as part of the PR?

@saksham189 (Contributor, author) commented Jan 13, 2019:

Yes, I'll add the changes and update this PR soon.

@saksham189 saksham189 force-pushed the saksham189:br branch 7 times, most recently from d77010d to a5fa270 Jan 20, 2019

@saksham189 (Contributor, author) commented Jan 21, 2019:

@zoq this is ready for review.

> If I remember right, Alex Graves used the Reber grammar test (already implemented) for the BLSTM as well.

I couldn't find results for BLSTM on the Reber grammar test in the paper, but I was able to find that BLSTMs can converge on the Reber grammar test within 1000 iterations, so I have implemented a test based on that.

> Do you think we could introduce another template parameter, something like: typename MergeLayerType = Concat<>()?

Added. I have also added a MergeOutputType, which is the final output non-linearity to be used, such as softmax, sigmoid, etc.

I have followed the BRNN algorithm described in Alex Graves' paper when implementing the BRNN class, which could be helpful while reviewing.

[Screenshots of the algorithm from the paper, attached Jan 20, 2019.]
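For reviewers less familiar with the algorithm, here is a toy Armadillo sketch of the control flow only: one pass forward in time, one pass backward in time, and a per-step concatenation of the two outputs. The simple tanh cell (with identity recurrent weights) is a placeholder and is not the cell or the code used in this PR.

#include <armadillo>

// input: features x batch x time steps; w*: hidden x features (toy weights).
arma::cube BidirectionalPass(const arma::cube& input,
                             const arma::mat& wForward,
                             const arma::mat& wBackward)
{
  const size_t T = input.n_slices;
  const size_t hidden = wForward.n_rows;
  const size_t batch = input.n_cols;

  arma::cube forwardOut(hidden, batch, T, arma::fill::zeros);
  arma::cube backwardOut(hidden, batch, T, arma::fill::zeros);

  arma::mat hForward(hidden, batch, arma::fill::zeros);
  arma::mat hBackward(hidden, batch, arma::fill::zeros);

  // Forward RNN: process the sequence from t = 0 up to T - 1.
  for (size_t t = 0; t < T; ++t)
  {
    hForward = arma::tanh(wForward * input.slice(t) + hForward);
    forwardOut.slice(t) = hForward;
  }

  // Backward RNN: process the same sequence from t = T - 1 down to 0.
  for (size_t t = T; t-- > 0; )
  {
    hBackward = arma::tanh(wBackward * input.slice(t) + hBackward);
    backwardOut.slice(t) = hBackward;
  }

  // Merge: concatenate the forward and backward outputs along the feature
  // dimension at every time step; an output non-linearity would follow.
  arma::cube merged(2 * hidden, batch, T);
  for (size_t t = 0; t < T; ++t)
    merged.slice(t) = arma::join_cols(forwardOut.slice(t), backwardOut.slice(t));

  return merged;
}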

@zoq zoq added the s: needs review label Jan 24, 2019

@@ -0,0 +1,83 @@
/**
* @file run_set_visitor_impl.hpp
* @author Marcus Edel

@zoq (Member) commented Jan 24, 2019:

Since you wrote the code, you should adjust the name here :)

* Train the specified BRNN network and construct a Reber grammar dataset.
*/
template<typename RecurrentLayerType>
void ReberGrammarTestNetworkBRNN(const size_t hiddenSize = 4,

@zoq (Member) commented Jan 24, 2019:

Do you think we could pass a model instantiation to make the test code more general? That way we wouldn't have to write a special version for each network. Let me know what you think.

@saksham189 (Contributor, author) commented Jan 25, 2019:

Yes, that does reduce quite a bit of code duplication.
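As a rough sketch of what that could look like (the helper name, signature, and the GenerateReberGrammarData function are hypothetical; the actual test reuses mlpack's existing Reber grammar utilities in the recurrent network tests):

#include <armadillo>

// Hypothetical data generator; the real test reuses mlpack's existing
// Reber grammar generation code.
void GenerateReberGrammarData(arma::cube& input, arma::cube& labels);

// The caller constructs the model (RNN, BRNN, LSTM-based, ...), so a single
// helper covers every network type. ModelType only needs Train() and
// Predict(), mirroring mlpack's network classes.
template<typename ModelType>
void ReberGrammarTestNetwork(ModelType& model, const size_t epochs = 25)
{
  arma::cube trainInput, trainLabels;
  GenerateReberGrammarData(trainInput, trainLabels);

  for (size_t epoch = 0; epoch < epochs; ++epoch)
    model.Train(trainInput, trainLabels);

  arma::cube testInput, testLabels, predictions;
  GenerateReberGrammarData(testInput, testLabels);
  model.Predict(testInput, predictions);

  // The actual test would now compare predictions against testLabels and
  // require a minimum success rate over the generated sequences.
}

A BRNN or RNN instance would then be constructed inside each individual test case and passed in.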

@saksham189 saksham189 force-pushed the saksham189:br branch 5 times, most recently from b4a89f0 to 3712836 Jan 25, 2019

@saksham189 (Contributor, author) commented Feb 7, 2019:

> @saksham189 Just a note: if you update the PR, I think it's useful that the changes you made are visible; this makes the review a lot easier, since the reviewer can see right away what has changed (between updates). Let me know what you think.

Yup, sorry about the inconvenience. I will push changes in new commits from now on.

@zoq (Member) commented Feb 9, 2019:

@mlpack-jenkins test this please

@saksham189 saksham189 force-pushed the saksham189:br branch 2 times, most recently from d237ff4 to 56513a4 Feb 9, 2019

@saksham189 saksham189 force-pushed the saksham189:br branch from 56513a4 to aa4d7dc Feb 11, 2019

@saksham189 saksham189 force-pushed the saksham189:br branch from aa4d7dc to f618d1d Feb 11, 2019

@saksham189 (Contributor, author) commented Feb 12, 2019:

@zoq this is ready for review now.

@@ -30,7 +30,7 @@ class GradientVisitor : public boost::static_visitor<void>
public:
//! Executes the Gradient() method of the given module using the input and
//! delta parameter.
- GradientVisitor(arma::mat&& input, arma::mat&& delta);
+ GradientVisitor(arma::mat&& input, arma::mat&& delta, int layer = -1);

@zoq (Member) commented Feb 12, 2019:

Do you think that, instead of using int layer = -1, we could provide a second constructor:

GradientVisitor(arma::mat&& input, arma::mat&& delta) : hasIndex(false), index(0)
{
}

GradientVisitor(arma::mat&& input, arma::mat&& delta, const size_t index) : hasIndex(true), index(index)
{
}

Let me know what you think.
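For illustration, a simplified, self-contained version of that idea, showing how the stored flag could drive the dispatch in the call operator; mlpack's actual GradientVisitor additionally uses SFINAE to check which Gradient() overloads a layer provides, which is omitted here:

#include <utility>
#include <armadillo>
#include <boost/variant/static_visitor.hpp>

class GradientVisitorSketch : public boost::static_visitor<void>
{
 public:
  // Gradient of the whole layer.
  GradientVisitorSketch(arma::mat&& input, arma::mat&& delta) :
      input(std::move(input)), delta(std::move(delta)),
      hasIndex(false), index(0) { }

  // Gradient of only the sub-layer identified by index.
  GradientVisitorSketch(arma::mat&& input, arma::mat&& delta,
                        const size_t index) :
      input(std::move(input)), delta(std::move(delta)),
      hasIndex(true), index(index) { }

  template<typename LayerType>
  void operator()(LayerType* layer) const
  {
    if (hasIndex)
    {
      layer->Gradient(std::move(input), std::move(delta),
          std::move(layer->Gradient()), index);
    }
    else
    {
      layer->Gradient(std::move(input), std::move(delta),
          std::move(layer->Gradient()));
    }
  }

 private:
  arma::mat&& input;
  arma::mat&& delta;
  const bool hasIndex;
  const size_t index;
};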

@saksham189 (Contributor, author) commented Feb 12, 2019:

Alright, sounds good to me.

@saksham189 (Contributor, author) commented Feb 13, 2019:

@zoq I have made the changes.

@saksham189 saksham189 force-pushed the saksham189:br branch 2 times, most recently from ff70d98 to 39bfe45 Feb 12, 2019

@saksham189 saksham189 force-pushed the saksham189:br branch from 39bfe45 to 458db49 Feb 12, 2019

@zoq (Member) approved these changes Feb 13, 2019 and left a comment:

No further comments from my side; this looks good to me.

@ShikharJ (Member) commented Feb 14, 2019:

Nice work @saksham189. I'll review in the next few days and then we can merge this in.

@mlpack-bot (bot) approved these changes Feb 14, 2019 and left a comment:

Second approval provided automatically after 24 hours. 👍

@zoq (Member) commented Feb 16, 2019:

One more note: if you do commit something, it would be helpful if the commit message reflects the changes you made. That makes searching through the commit history a lot easier.

@ShikharJ (Member) left a review comment:

I only took a quick look and have some minor changes. I guess I'm not too familiar with BRNNs, so maybe someone with more experience should provide a second review.

layer->Backward(std::move(input), std::move(error),
std::move(delta), index);
}

@ShikharJ (Member) commented Feb 16, 2019:

Extra newline here.

layer->Gradient(std::move(input), std::move(delta),
std::move(layer->Gradient()), index);
}

@ShikharJ (Member) commented Feb 16, 2019:

Same here.

parameter.n_rows/2,
parameter.n_cols);
forwardGradient = arma::zeros<arma::mat>(
parameter.n_rows/2,

@ShikharJ (Member) commented Feb 16, 2019:

Can you add a space around the / 2 here and above?

backwardRNN.ResetDeterministic();
}


@ShikharJ (Member) commented Feb 16, 2019:

Extra newline used here.

@saksham189 saksham189 force-pushed the saksham189:br branch from bb846e2 to a9c82e5 Feb 18, 2019

@saksham189 (Contributor, author) commented Feb 18, 2019:

> One more note: if you do commit something, it would be helpful if the commit message reflects the changes you made. That makes searching through the commit history a lot easier.

I will make sure to do this from now on. I have changed the previous commit message as well. @zoq

@zoq (Member) commented Feb 18, 2019:

Great, @saksham189, would you like to add anything else here, or is this ready?

@saksham189 (Contributor, author) commented Feb 18, 2019:

This is ready from my side, unless you have anything to add. @zoq

@zoq zoq merged commit 2863cba into mlpack:master Feb 20, 2019

6 checks passed

  • LaTeX Documentation Checks: Build finished.
  • Memory Checks: Build finished.
  • Static Code Analysis Checks: Build finished.
  • Style Checks: Build finished.
  • continuous-integration/appveyor/pr: AppVeyor build succeeded.
  • continuous-integration/travis-ci/pr: The Travis CI build passed.
@zoq (Member) commented Feb 20, 2019:

Thanks again for the great contribution.
