Could you move the code movement and refactoring into a separate patch, and make this patch just a bug fix?
I reverted the refactoring changes as suggested.
There are only a couple of variable renames I kept to make all names consistent across the
Let's wait on #519 and also add a unit test.
I haven't found time to add a test for this PR yet, but I believe it should be merged anyway, with a TODO for a test, because it fixes a bug in the RNN gradients that will almost certainly cause problems for users. What do you think?
That sounds good. I will merge and create an issue for adding a test.
@rxwei @Shashi456 this tackles #518 and also includes the fix for #519. I haven't added a test yet, but will try to add one tomorrow. The fixed issues are:
zeroState is a function that takes an example input as an argument so that it can account for the batch size. In principle we could also switch it to take just the batch size as input, but in either case, until we can discuss the design of RNNs more extensively, this solution suffices and fixes the current failures.
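To make the two options concrete, here is a rough sketch in plain Python rather than the library's actual code. The names zero_state, zero_state_from_batch_size, and hidden_size are hypothetical and exist only for illustration. It assumes the input is a batch stored as a list of rows, one row per example.

```python
from typing import List

Matrix = List[List[float]]


def zero_state(example_input: Matrix, hidden_size: int) -> Matrix:
    """Build an all-zero hidden state with one row per example in the batch.

    Hypothetical helper: the batch size is read from the example input,
    so the caller never has to pass it separately.
    """
    batch_size = len(example_input)  # each row of the input is one example
    return [[0.0] * hidden_size for _ in range(batch_size)]


def zero_state_from_batch_size(batch_size: int, hidden_size: int) -> Matrix:
    """Alternative design: take the batch size directly instead of an input."""
    return [[0.0] * hidden_size for _ in range(batch_size)]


# A batch of 2 examples with 3 features each.
batch = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
state = zero_state(batch, hidden_size=4)
```

Both variants produce the same state. The difference is only whether the caller hands over an input to inspect or the batch size itself.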