
Handle edge cases in beam search #2557

Merged
merged 6 commits into from Mar 14, 2019

Conversation

3 participants
@epwalsh
Contributor

epwalsh commented Mar 1, 2019

Addresses the edge case brought up in #2486 (fixes #2486), as well as another. This turned out to be a little more nuanced than expected...

I first thought the issue was caused by start_predictions being the end_index, but it actually occurs in general with a beam size of 1 whenever the first predictions produced by the step function are the end_index, regardless of what start_predictions is, i.e. at this line:

start_class_log_probabilities, state = step(start_predictions, start_state)

The other edge case is similar, and occurs when the beam size is smaller than the number of valid (non-zero probability) transitions that the step function produces. For example, this could happen in a semantic parsing task where a masked log softmax is used to create predicted log probs for valid next actions. Though this doesn't cause the beam search to crash per se, I thought it would still be good to warn the user in these cases since some of the predicted sequences may be improbable.
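To illustrate the second edge case, here is a minimal sketch (plain Python, with hypothetical probability values) of what happens when a masked log softmax leaves fewer valid transitions than the beam size: taking the top `beam_size` entries, as `topk` does, must then include entries with log probability -inf.

```python
import math

beam_size = 5
# Hypothetical masked log-probabilities over 6 classes: only two transitions
# are valid (non-zero probability); a masked log softmax assigns -inf to the rest.
log_probs = [math.log(0.7), math.log(0.3),
             float("-inf"), float("-inf"), float("-inf"), float("-inf")]
# Selecting the top `beam_size` entries (what topk does) necessarily picks up
# -inf values whenever beam_size exceeds the number of valid transitions.
top_k = sorted(log_probs, reverse=True)[:beam_size]
num_infinite = sum(1 for p in top_k if math.isinf(p))
print(num_infinite)  # 3 of the 5 beam entries have infinite log probability
```

Those -inf beams are exactly the "improbable" sequences the new warning is about.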

@matt-gardner matt-gardner requested a review from brendan-ai2 Mar 2, 2019

@matt-gardner

Member

matt-gardner commented Mar 2, 2019

@brendan-ai2, when you have time for this, I think you have the most context here.

@brendan-ai2
Member

brendan-ai2 left a comment

Thanks for tackling these edge cases, @epwalsh! Just a few minor comments.

@@ -110,6 +111,11 @@ def search(self,
# shape: (batch_size, beam_size), (batch_size, beam_size)
start_top_log_probabilities, start_predicted_classes = \
start_class_log_probabilities.topk(self.beam_size)
if self.beam_size == 1 and (start_predicted_classes == self._end_index).all():
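A sketch of what this guard enables (hypothetical early-exit logic, not the PR's exact code): if the beam size is 1 and the single first prediction is already the end index, the search can return the single-token sequences directly instead of stepping into a state where every beam is empty.

```python
import math

beam_size = 1
end_index = 3  # hypothetical vocabulary index of the end token
# Hypothetical first topk results: the only predicted class is the end index.
start_predicted_classes = [end_index]
start_top_log_probabilities = [math.log(0.9)]

if beam_size == 1 and all(c == end_index for c in start_predicted_classes):
    # Early exit: every predicted sequence is just the end token.
    predictions = [[c] for c in start_predicted_classes]
    log_probabilities = start_top_log_probabilities
print(predictions)  # [[3]]
```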

@brendan-ai2

brendan-ai2 Mar 8, 2019

Member

Could you elaborate on why this only affects beams of size one? A test demonstrating that it can't happen with a beam size of two would be ideal. Sorry if I'm missing something obvious...

@epwalsh

epwalsh Mar 8, 2019

Author Contributor

So with a beam size > 1, topk returns multiple distinct indices, so you are guaranteed to have at least one non-empty beam, since at most one of the indices returned by topk can be the end_index. Therefore the bug described in #2486 is avoided with beam size > 1, i.e. it was only possible with beam size 1. FWIW, I did verify that it would fail with a beam size of 2 before implementing the explicit check.
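The argument above can be checked with a tiny sketch (hypothetical log probabilities; `end_index` is assumed to be index 0): since topk returns distinct indices, a beam size of 2 always leaves at least one non-end prediction, even when the end index has the highest probability.

```python
# Hypothetical distribution where the end index (0) is the most probable class.
end_index = 0
log_probs = [-0.1, -2.3, -4.0]
beam_size = 2

# Emulate topk: indices of the `beam_size` largest log probabilities.
top_indices = sorted(range(len(log_probs)),
                     key=lambda i: log_probs[i], reverse=True)[:beam_size]
# At most one of the distinct top indices can be the end index.
non_end = [i for i in top_indices if i != end_index]
print(non_end)  # [1] -- at least one non-empty beam survives
```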

@brendan-ai2

brendan-ai2 Mar 12, 2019

Member

Thanks for elaborating!

Resolved (outdated): allennlp/tests/nn/beam_search_test.py
warnings.warn("Infinite log probabilities encountered. Some final sequences may not make sense. "
"This can happen when the beam size is larger than the number of valid (non-zero "
"probability) transitions that the step function produces.",
RuntimeWarning)

@brendan-ai2

brendan-ai2 Mar 8, 2019

Member

Hmm, this is a bit unsatisfying, but I guess filtering out the impossible results thereby changing the shape of the returned tensor might be a worse fix. Could you add to the docstring that it's the responsibility of the caller to check the returned probabilities for invalid results?

@epwalsh

epwalsh Mar 8, 2019

Author Contributor

Added to the docstring.
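Since the docstring now puts responsibility on the caller to check the returned probabilities, a minimal caller-side sketch (shapes and values hypothetical; the search is assumed to return per-beam log probabilities) might look like:

```python
import math

# Hypothetical (batch_size=2, beam_size=2) log probabilities returned by the
# search; -inf marks a beam that was filled with an invalid transition.
log_probabilities = [[-0.5, float("-inf")],  # batch item 0: second beam invalid
                     [-1.2, -2.7]]           # batch item 1: both beams valid
# Keep only beam indices with finite log probability for each batch item.
valid_beams = [[b for b, lp in enumerate(row) if not math.isinf(lp)]
               for row in log_probabilities]
print(valid_beams)  # [[0], [0, 1]]
```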

@epwalsh

Contributor Author

epwalsh commented Mar 9, 2019

Looks like the last build was manually interrupted; I'm not sure why.

@matt-gardner

Member

matt-gardner commented Mar 9, 2019

I know there was some kind of restart that happened a day or two ago; not sure if this was related or not. I just restarted the build.

@epwalsh

Contributor Author

epwalsh commented Mar 9, 2019

@matt-gardner thanks!

@brendan-ai2
Member

brendan-ai2 left a comment

Appreciate the elaboration, this makes sense now. Just a few more nitpicking comments and then we should be good. Thanks again!

Resolved (outdated): allennlp/nn/beam_search.py
Resolved (outdated): allennlp/nn/beam_search.py
Resolved (outdated): allennlp/tests/nn/beam_search_test.py
Resolved (outdated): allennlp/tests/nn/beam_search_test.py
Resolved: allennlp/nn/beam_search.py

brendan-ai2 and others added some commits Mar 12, 2019

@brendan-ai2
Member

brendan-ai2 left a comment

Thanks, @epwalsh!

@brendan-ai2 brendan-ai2 merged commit 720d306 into allenai:master Mar 14, 2019

3 checks passed

Pull Requests (AllenNLP Library) TeamCity build finished
codecov/patch: 100% of diff hit (target 90%)
codecov/project: 92% (+<1%) compared to 0f7bcf5