This repository was archived by the owner on Jul 18, 2024. It is now read-only.

Conversation

@aasseman
Contributor

@aasseman aasseman commented Apr 25, 2019

(Depends on PR #18 #17)

Adding an RNN decoder with attention, as per this PyTorch tutorial, itself based on the paper Neural Machine Translation by Jointly Learning to Align and Translate.
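For reference, the additive ("Bahdanau") attention step from that paper can be sketched as below. This is a minimal NumPy illustration with made-up names and shapes, not the PR's actual code:

```python
import numpy as np

def bahdanau_attention(decoder_hidden, encoder_outputs, W_dec, W_enc, v):
    """Additive (Bahdanau) attention over encoder outputs.

    decoder_hidden:  (hidden,)         previous decoder hidden state
    encoder_outputs: (seq_len, hidden) encoder hidden states
    W_dec, W_enc:    (attn, hidden)    learned projection matrices
    v:               (attn,)           learned scoring vector
    """
    # Score each encoder position against the current decoder state.
    scores = np.tanh(encoder_outputs @ W_enc.T + decoder_hidden @ W_dec.T) @ v
    # Softmax-normalize the scores into attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder outputs.
    context = weights @ encoder_outputs
    return context, weights
```

The context vector is then concatenated with the decoder input at each step, which is what lets the decoder "look back" at relevant source positions.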

@aasseman aasseman added enhancement New feature or request WIP Work in progress, not ready for merge yet. labels Apr 25, 2019
@aasseman aasseman requested a review from tkornuta-ibm April 25, 2019 01:59
@aasseman
Contributor Author

aasseman commented Apr 25, 2019

Seems to work well enough for now to run wikitext_language_modeling_encoder_attndecoder.yml, which is a basic copy task.
Supports only a single-layer GRU for now. Will probably not try extending that. Are multiple layers useful?

Will try to add the translation task from the PyTorch tutorial, so as to compare apples to apples and check whether my implementation is doing what it should.

@aasseman aasseman changed the title (depends on PR #13) (WIP) RNN Decoder with attention (WIP) RNN Decoder with attention Apr 26, 2019
@tkornuta-ibm
Contributor

This pull request introduces 1 alert when merging df60fb4 into aed980d - view on LGTM.com

new alerts:

  • 1 for Unused local variable

Comment posted by LGTM.com

@aasseman aasseman changed the title (WIP) RNN Decoder with attention (Depends on PR #18 #17)(WIP) RNN Decoder with attention Apr 26, 2019
@tkornuta-ibm
Contributor

This pull request introduces 1 alert when merging 0138ef2 into aed980d - view on LGTM.com

new alerts:

  • 1 for Unused local variable

Comment posted by LGTM.com

@aasseman aasseman changed the title (Depends on PR #18 #17)(WIP) RNN Decoder with attention (Depends on PR #18 #17) RNN Decoder with attention Apr 26, 2019
@aasseman aasseman marked this pull request as ready for review April 26, 2019 17:56
@aasseman aasseman removed the WIP Work in progress, not ready for merge yet. label Apr 26, 2019
@aasseman
Contributor Author

Ready for review.
Got good results on the translation task.
One thing that could be checked (visually), if time permits, is the attention matrix of the GRU decoder, to make really sure that we get results similar to the paper's.
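Such a visual check can be sketched with a matplotlib heatmap; the helper below is illustrative (names and paths are made up), not part of the PR:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend for saving to file
import matplotlib.pyplot as plt

def plot_attention(weights, input_tokens, output_tokens, path="attention.png"):
    """Save a heatmap of decoder attention weights.

    weights: (len(output_tokens), len(input_tokens)) array, one row of
             attention weights per decoding step.
    """
    fig, ax = plt.subplots()
    im = ax.imshow(weights, cmap="viridis")
    ax.set_xticks(range(len(input_tokens)))
    ax.set_xticklabels(input_tokens, rotation=90)
    ax.set_yticks(range(len(output_tokens)))
    ax.set_yticklabels(output_tokens)
    fig.colorbar(im)
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

For a faithful translation, the heatmap should show a roughly diagonal band (with local reorderings), as in the figures of the Bahdanau et al. paper.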

Contributor

@tkornuta-ibm tkornuta-ibm left a comment


Please clean up that wikitext_language_modeling_seq2seq.yml file.

Aside from that, some of the changes were already merged... why is GitHub not seeing that? Are you squashing the commits or something?

hidden_size: 50
num_layers: 1
use_logsoftmax: False
<<<<<<< Updated upstream
Contributor


Hmmmm, I guess the "Updated upstream" version is the right one... but I'm not sure.

Contributor Author


I guess I got mixed up with my stashes. I checked out the develop version of the two files in question, and that solved it.

@aasseman aasseman requested a review from tkornuta-ibm April 27, 2019 01:25
@tkornuta-ibm tkornuta-ibm merged commit 02a9179 into IBM:develop Apr 27, 2019