Fixed a couple RNN bugs. #522

eaplatanios · 2019-10-02T00:25:44Z

@rxwei @Shashi456 this tackles #518 and also includes the fix from #519. I haven't added a test yet, but will try to add one tomorrow. The fixed issues are:

- The backpropagated gradients for RNN cells were computed incorrectly, which made it impossible to train RNNs (especially when only the output of the last time step is used, e.g., sequence classification).
- The default zero state for RNNs always had a batch size of 1, which could not be broadcast and caused failures (e.g., when concatenating the input and hidden state in LSTM cells). zeroState is now a function that takes an example input as an argument so that it can account for the batch size. In principle we could also have it take just the batch size as input, but either way this solution fixes the current failures and suffices until we can discuss the design of RNNs more extensively.
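
To illustrate the second issue, here is a minimal sketch in plain Swift (no TensorFlow dependency). It is not the code from this patch: `Matrix`, `ToyCell`, `fixedZeroState`, and the `zeroState(for:)` spelling are hypothetical stand-ins, and the concatenation helper only models the shape check that is assumed to fail when the batch sizes differ.

```swift
// Hypothetical stand-in for a rank-2 tensor of shape [batchSize, featureCount].
struct Matrix {
    var rows: Int      // batch size
    var columns: Int   // feature count
    var values: [Float]

    init(rows: Int, columns: Int, repeating value: Float = 0) {
        self.rows = rows
        self.columns = columns
        self.values = Array(repeating: value, count: rows * columns)
    }

    /// Concatenates along the feature axis.
    /// Returns nil when the batch sizes differ (modeling the failure described above).
    func concatenated(with other: Matrix) -> Matrix? {
        guard rows == other.rows else { return nil }
        var result = Matrix(rows: rows, columns: columns + other.columns)
        for r in 0..<rows {
            for c in 0..<columns {
                result.values[r * result.columns + c] = values[r * columns + c]
            }
            for c in 0..<other.columns {
                result.values[r * result.columns + columns + c] =
                    other.values[r * other.columns + c]
            }
        }
        return result
    }
}

// Hypothetical cell that only knows its hidden size.
struct ToyCell {
    let hiddenSize: Int

    /// Old behavior (assumed): the zero state always has a batch size of 1.
    var fixedZeroState: Matrix {
        return Matrix(rows: 1, columns: hiddenSize)
    }

    /// New behavior (assumed spelling): the batch size is read from an example input.
    func zeroState(for input: Matrix) -> Matrix {
        return Matrix(rows: input.rows, columns: hiddenSize)
    }
}

let cell = ToyCell(hiddenSize: 3)
let input = Matrix(rows: 4, columns: 2, repeating: 1)  // batch of 4 examples

// Batch sizes 4 and 1 do not match, so the concatenation fails.
let mismatched = input.concatenated(with: cell.fixedZeroState)

// Batch sizes 4 and 4 match, so the result has shape [4, 2 + 3].
let matched = input.concatenated(with: cell.zeroState(for: input))
```

Taking an example input rather than a bare batch size lets the cell read whatever shape information it needs from the same value it is about to process, at the cost of requiring an input to be available before the state is created.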

rxwei

Could you move the code movement and refactoring into a separate patch, and make this patch just a bug fix?

Sources/TensorFlow/Layers/Recurrent.swift

eaplatanios · 2019-10-02T00:43:17Z

I reverted the refactoring changes as suggested.

eaplatanios · 2019-10-02T00:44:28Z

I kept only a couple of variable renames, to make all names consistent between callAsFunction and its corresponding VJP.

rxwei · 2019-10-02T00:46:20Z

Let's wait on #519 and also add a unit test.

eaplatanios · 2019-11-12T06:32:53Z

I haven't found time to add a test for this PR yet, but I believe it should be merged anyway, with a TODO added for the test, because it fixes a bug in the RNN gradients that will almost certainly cause problems for users. What do you think?

marcrasi · 2019-11-14T18:37:16Z

That sounds good. I will merge and create an issue for adding a test.

eaplatanios added 8 commits July 20, 2019 09:55

Marked initializer VJP parameters as '__owned'.

d29c48e

Merge remote-tracking branch 'upstream/master'

ff337e8

Merge remote-tracking branch 'upstream/master'

8274dd3

Merge remote-tracking branch 'upstream/master'

2db7fa0

Merge remote-tracking branch 'upstream/master'

34b0f2a

Merge remote-tracking branch 'upstream/master'

6e54cc5

Merge remote-tracking branch 'upstream/master'

c8b6dee

Fixed a couple RNN bugs.

99784e7

eaplatanios requested review from rxwei and dan-zheng October 2, 2019 00:25

eaplatanios added the kokoro:run label Oct 2, 2019

kokoro-team removed the kokoro:run label Oct 2, 2019

rxwei reviewed Oct 2, 2019

Sources/TensorFlow/Layers/Recurrent.swift

Reverted refactoring.

22cd31a

rxwei approved these changes Oct 2, 2019

Shashi456 mentioned this pull request Oct 15, 2019

Fix RNN gradient accumulation. #519

Merged

marcrasi mentioned this pull request Nov 14, 2019

add test for #522 #555

Open

marcrasi merged commit 25c7cfe into tensorflow:master Nov 14, 2019
