This is a followup from LUCENE-5182. Due to the nature of the FVH extract logic, a simple phrase query can put the FVH into a super long running recursion. I had documents taking literally days to return from the extract phrases logic. I have a test that reproduces the problem and a possible fix. The reason for this is that the FVH never tries to terminate early if a phrase is already way beyond the slop coming from the phrase query. If there is a document with a lot of occurrences of two or more terms in the phrase, it literally tries to match all possible combinations of the terms in the doc. I don't think we can fix the FVH without rewriting it, since this alg is freaking crazy and somehow n! of all the positions etc. I am not even sure what the Big-O of this is, but I have a patch that tries to prevent this thing from going totally nuts.
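To make the blow-up concrete, here is a rough sketch of the idea, not the actual FVH code or the patch. It is a toy model that assumes ordered phrase matching and treats the allowed window as `(number of terms - 1) + slop`; the function name `count_visits` and its parameters are made up for illustration. It counts how many recursion steps are taken with and without giving up once a candidate is already outside the window.

```python
def count_visits(positions, slop, prune):
    """Enumerate candidate position chains for a phrase (toy model, not Lucene).

    positions: one sorted list of term positions per phrase term, in phrase order.
    slop: extra distance allowed between the first and the last term (assumed model).
    prune: if True, stop extending a chain once it is outside the allowed window.
    Returns (matches, visited), where visited counts the recursion steps taken.
    """
    n_terms = len(positions)
    max_window = (n_terms - 1) + slop  # widest allowed span: last pos - first pos
    matches = 0
    visited = 0

    def extend(term_idx, first_pos, prev_pos):
        nonlocal matches, visited
        if term_idx == n_terms:
            # A complete chain only counts if it fits into the window.
            if prev_pos - first_pos <= max_window:
                matches += 1
            return
        for pos in positions[term_idx]:
            if pos <= prev_pos:
                continue  # terms have to appear in phrase order
            if prune and pos - first_pos > max_window:
                break  # positions are sorted, so every later one is even further away
            visited += 1
            extend(term_idx + 1, first_pos, pos)

    for start in positions[0]:
        visited += 1
        extend(1, start, start)
    return matches, visited


# A document that repeats the three-term phrase "a b c" 50 times.
doc_positions = [
    [3 * i for i in range(50)],      # positions of "a"
    [3 * i + 1 for i in range(50)],  # positions of "b"
    [3 * i + 2 for i in range(50)],  # positions of "c"
]
print(count_visits(doc_positions, slop=0, prune=False))
print(count_visits(doc_positions, slop=0, prune=True))
```

Both calls find the same matches, but the unpruned run walks through far more candidate chains, and the gap widens quickly as more terms or more occurrences are added. That is the kind of runaway work an early termination check is meant to cut off.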
… high freq phrase terms.
Terminate phrase searches early if the max phrase window is exceeded in
FastVectorHighlighter, to prevent very long running phrase
extraction when phrase terms are highly frequent. See LUCENE-5182
Closes #3543