-
Notifications
You must be signed in to change notification settings - Fork 74k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bidirectional_dynamic_rnn: make sequence length optional #5588
Comments
@ebrevdo My understanding is that sequence lengths should be required, since it doesn't make sense to do dynamic rnn fanciness otherwise. However, that language does seem to imply it's optional. Thoughts? |
You don't need sequence_lengths to do dynamic RNN; the fact that your time dimension is one of the axes of the inputs, and can vary from step to step, is what makes it dynamic. Providing sequence_lengths is necessary for correctness though: if you pass these in, your final state value will be the true final state at the end of the sequence. Otherwise it'll just keep calculating to the end of your inputs (which may include padding inputs - and this will throw your state off). That said, it does seem that if sequence_lengths is None, we should just use array_ops.reverse to do a blind reversal. The way bidirectional_rnn does. PRs welcome. |
A follow up question: if we use padding inputs, and provide sequence_lengths to bidirectional_dynamic_rnn, would the backwark rnn read from the padding end or the true end of these padding inputs?
|
This looks like a subtle bug introduced in a refactoring. Instead of
array_ops.reverse_sequence, the bidirectional code should call _reverse_seq
in that module. I'll fix it.
…On Fri, Mar 3, 2017 at 10:57 AM, Dmitry Persiyanov ***@***.*** > wrote:
Is there someone who works on this issue? I could prepare pull request.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#5588 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABtim9q5PVqfE40MZiBQN7VS0uNWCW6uks5riGKqgaJpZM4Kw0_D>
.
|
@vanpersie32 Thank you so much for pointing this out. |
Does this operation has performance degraded? |
No more than the current version.
…On Thu, Dec 14, 2017, 2:22 AM Bayberry Z. ***@***.***> wrote:
@ebrevdo <https://github.com/ebrevdo>
Otherwise it'll just keep calculating to the end of your inputs (which may
include padding inputs - and this will throw your state off). That said, it
does seem that if sequence_lengths is None, we should just use
array_ops.reverse to do a blind reversal.
Does this operation has performance degraded?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#5588 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABtim0tn0wgGJ1nxopq3Z8_MJcpaHFJZks5tAPbKgaJpZM4Kw0_D>
.
|
So in this example: inputs = [[9,8,0,0]] |
Qualifier: very new to TF, so I may be missing something very obvious. If so, apologies.
DOCUMENTATION:
Is a little conflicting:
"sequence_length: An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences." <-- implies sequence_length is required
However...
"The initial state for both directions is zero by default (but can be set optionally) and no intermediate states are ever returned -- the network is fully unrolled for the given (passed in) length(s) of the sequence(s) or completely unrolled if length(s) is not given." <-- implies sequence_length is not required.
FUNCTIONALITY"
Base on the function itself ("dynamic"), I would assume that allowing sequence_length=None is intended (although, in either case, the documentation should be straightened out).
Assuming we do want to allow seq_length=None, the offending portion within bidirectional_dynamic_rnn would appear to be:
inputs_reverse = array_ops.reverse_sequence(...seq_lengths=sequence_length...)
reverse_sequence appears to require seq_lengths != None
Oddly enough, bidirectional_rnn may(?) be robust to this problem, as it uses _reverse_seq, which handles lengths=None.
That said, I haven't tested this extensively, as I'm of course not 100% sure I've interpreted the above correctly and as to what the intended functionality is.
Given guidance, I could take a stab at a pull request to address the above. Or maybe this is a quick fix/known issue (or non-issue...).
The text was updated successfully, but these errors were encountered: