FVH can end in very very long running recursion on phrase highlight #3543

Closed
s1monw opened this issue Aug 20, 2013 · 1 comment

@s1monw
Contributor

s1monw commented Aug 20, 2013

This is a follow-up to LUCENE-5182. Due to the nature of the FVH extraction logic, a simple phrase query can put the FVH into a super long running recursion. I had documents taking literally days to return from the phrase extraction logic. I have a test that reproduces the problem and a possible fix. The reason is that the FVH never tries to terminate early when a phrase is already way beyond the slop coming from the phrase query. If a document has a lot of occurrences of two or more terms in the phrase, it literally tries to match all possible combinations of those terms in the doc. I don't think we can fix the FVH without rewriting it, since this algorithm is freaking crazy and somehow n! over all the positions etc. I am not even sure what the Big-O of this is, but I have a patch that tries to prevent this thing from going totally nuts.
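
To illustrate the idea of terminating early once a candidate is beyond the slop, here is a rough toy sketch. It is not the FVH code and not the patch itself: the class `PhraseWindowSketch`, its methods, the `visited` counter, and the window rule (number of terms - 1 + slop) are all assumptions made up for illustration. It models each phrase term as a sorted array of positions and compares a recursion that tries every combination with one that stops as soon as the next position falls outside the allowed window.

```java
// Toy model only: hypothetical names, not Lucene's FastVectorHighlighter internals.
final class PhraseWindowSketch {

    // Number of recursive calls made by the most recent countMatches run.
    public static long visited;

    // Builds positions where term t occurs at t, t + terms, t + 2 * terms, ...
    // so every term is "high frequency" and the phrase repeats back to back.
    public static int[][] interleaved(int terms, int occurrences) {
        int[][] positions = new int[terms][occurrences];
        for (int t = 0; t < terms; t++) {
            for (int k = 0; k < occurrences; k++) {
                positions[t][k] = k * terms + t;
            }
        }
        return positions;
    }

    // positions[i] holds the sorted positions of the i-th phrase term.
    // A match picks one strictly increasing position per term such that the
    // last position is at most start + (terms - 1) + slop (assumed window rule).
    public static int countMatches(int[][] positions, int slop, boolean prune) {
        visited = 0;
        int total = 0;
        for (int start : positions[0]) {
            total += extend(positions, 1, start, start, slop, prune);
        }
        return total;
    }

    private static int extend(int[][] positions, int term, int start,
                              int prev, int slop, boolean prune) {
        visited++;
        int maxEnd = start + (positions.length - 1) + slop;
        if (term == positions.length) {
            // Without pruning, out-of-window candidates are only rejected here,
            // after the whole combination has already been enumerated.
            return prev <= maxEnd ? 1 : 0;
        }
        int matches = 0;
        for (int pos : positions[term]) {
            if (pos <= prev) {
                continue; // terms must appear in phrase order
            }
            if (prune && pos > maxEnd) {
                break; // sorted input: every later position is also too far away
            }
            matches += extend(positions, term + 1, start, pos, slop, prune);
        }
        return matches;
    }

    public static void main(String[] args) {
        int[][] positions = interleaved(3, 20);
        int pruned = countMatches(positions, 0, true);
        long prunedVisited = visited;
        int naive = countMatches(positions, 0, false);
        long naiveVisited = visited;
        System.out.println("matches (pruned/naive): " + pruned + "/" + naive);
        System.out.println("calls   (pruned/naive): " + prunedVisited + "/" + naiveVisited);
    }
}
```

Both variants report the same matches; only the number of recursive calls differs, and the gap widens quickly as the number of occurrences or phrase terms grows. The sketch only shows the pruning principle, so whether the real fix uses the same window rule is left open here.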

@ghost ghost assigned s1monw Aug 20, 2013
s1monw added a commit to s1monw/elasticsearch that referenced this issue Aug 20, 2013
@s1monw s1monw closed this as completed in 9af7a85 Aug 20, 2013
s1monw added a commit that referenced this issue Aug 20, 2013
… high freq phrase terms.

Terminate phrase searches early if max phrase window is exceeded in
FastVectorHighlighter to prevent very long running phrase
extraction if phrase terms are high frequent. See LUCENE-5182

Closes #3543
@synhershko
Contributor

So much for being a _Fast_VH, too many long loops discovered lately :)

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
… high freq phrase terms.

Terminate phrase searches early if max phrase window is exceeded in
FastVectorHighlighter to prevent very long running phrase
extraction if phrase terms are high frequent. See LUCENE-5182

Closes elastic#3543