BeamSearchDecoder incorrectly truncates results when used with dynamic_decode #13536

Closed
bdaskalov opened this issue Oct 6, 2017 · 10 comments
Labels: stat:contribution welcome (Status - Contributions welcome), type:bug (Bug)

@bdaskalov
Contributor

System information (irrelevant for this bug)

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04/Any
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v1.3.0-rc2-20-g0787eee 1.3.0
  • Python version: Python 3.5.2 :: Continuum Analytics, Inc.
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: irrelevant
  • GPU model and memory: irrelevant
  • Exact command to reproduce: irrelevant

Describe the problem

tf.contrib.seq2seq.BeamSearchDecoder incorrectly truncates some of the results because the same beam index was previously used for a beam member that ended at an earlier step.

The root of the problem is that the while_loop body in dynamic_decode assumes that sequences are independent and finish only once. At the same time, BeamSearchDecoder creates a tree-like structure in which a beam index can be reused at a later step for a state that originates from a different parent index. This causes the decoding loop to sometimes record the wrong sequence length for a beam member. That wrong sequence length is then passed to BeamSearchDecoder.finalize, which returns a truncated sequence.
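
For reference, the per-step length bookkeeping inside dynamic_decode's loop body works roughly like the sketch below (simplified and illustrative, not the exact tf.contrib.seq2seq source): a beam slot's length is written only at the step where its finished flag first becomes True, so when BeamSearchDecoder later reuses that slot for a hypothesis with a different parent, the frozen length no longer matches the sequence that ends up in that slot.

import tensorflow as tf

def update_sequence_lengths(time, finished, next_finished, sequence_lengths):
    # Record a length only for slots that transition to finished at this step;
    # slots that were already finished keep their previously recorded length.
    # Once BeamSearchDecoder reuses such a slot for a different parent
    # hypothesis, that frozen length becomes stale.
    just_finished = tf.logical_and(tf.logical_not(finished), next_finished)
    return tf.where(just_finished,
                    tf.fill(tf.shape(sequence_lengths), time + 1),
                    sequence_lengths)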

Source code / logs

I use the following code to work around the problem. It causes the right sequence to be returned, but the length returned by dynamic_decode is still wrong.

from tensorflow.contrib import seq2seq


class FixedBeamSearchDecoder(seq2seq.BeamSearchDecoder):
    def finalize(self, outputs, final_state, sequence_lengths):
        # BeamSearchDecoder does not follow the correct semantics of the finished flag,
        # which results in taking the wrong lengths here and decoding the wrong strings.
        # We substitute the sequence lengths recorded by dynamic_decode (which are wrong
        # because of the wrong finished flag returned by BeamSearchDecoder.step) with the
        # lengths recorded in BeamSearchDecoderState, which are correct.
        return super().finalize(outputs, final_state, final_state.lengths)
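
For context, here is a hedged example of how this workaround could be wired into a decoding graph; the cell, embedding, state, and token variables are placeholders for illustration, not code from the original report.

import tensorflow as tf
from tensorflow.contrib import seq2seq

beam_width = 5  # illustrative value
decoder = FixedBeamSearchDecoder(
    cell=decoder_cell,
    embedding=embedding_matrix,
    start_tokens=tf.fill([batch_size], start_token_id),
    end_token=end_token_id,
    initial_state=seq2seq.tile_batch(encoder_final_state, multiplier=beam_width),
    beam_width=beam_width)

# dynamic_decode still reports the sequence lengths it tracked itself, but the
# decoded ids come back un-truncated thanks to the finalize override above.
outputs, final_state, final_lengths = seq2seq.dynamic_decode(
    decoder, maximum_iterations=max_decode_steps)
predicted_ids = outputs.predicted_ids  # [batch_size, max_time, beam_width]
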
@bdaskalov
Contributor Author

@ebrevdo Can you take a look? I see that you wrote the seq2seq library. I wanted to submit a fix but I don't see how to correct this problem without changing some of the library's public interface.

@ebrevdo
Contributor

ebrevdo commented Oct 6, 2017

Seems ok to update the BeamSearchDecoder.finalize to use final_state.lengths instead of sequence_lengths -- looks like this fixes a couple of other open issues.

We could consider having finalize return the new, updated sequence lengths to dynamic_decode as well.

@ebrevdo
Contributor

ebrevdo commented Oct 6, 2017

Thanks for catching this! Could you send a PR with the fix and a unit test that catches it?

asimshankar added the type:bug, stat:awaiting response, and stat:contribution welcome labels and removed the stat:awaiting response label on Oct 7, 2017
@ebrevdo
Contributor

ebrevdo commented Oct 13, 2017

Will look into submitting a fix.

ebrevdo self-assigned this on Oct 13, 2017
@bdaskalov
Contributor Author

Sorry, I've been meaning to make a PR since last week but never got to it.

@ebrevdo
Contributor

ebrevdo commented Oct 15, 2017 via email

@ebrevdo
Contributor

ebrevdo commented Oct 16, 2017

The problem is deeper and the solution requires some additional changes. I'll try to submit something in the next couple days.

vrv closed this as completed in 18f89c8 on Oct 18, 2017
@WolfNiu

WolfNiu commented Apr 30, 2018

Could anyone tell me when this bug was fixed? I couldn't find it in the release notes. Thank you! @ebrevdo

@guillaumekln
Contributor

It was first released in TensorFlow 1.5.

@WolfNiu

WolfNiu commented Apr 30, 2018

@guillaumekln Thank you for the info!
