Shingle filters that produce shingles of different size can create gigantic queries #23918
jimczi added the :Search/Search (Search-related issues that do not fall into other categories), blocker, >bug, and v5.3.1 labels on Apr 5, 2017
We should also update the docs to explain a better config. I'd consider removing the min/max shingle size settings in favour of a single size, and removing output_unigrams too?
@clintongormley 👍 for a doc on what a better config would be (if possible without altering the outcome of a request)
Shingle filters create a graph token stream that the query parser is now able to consume.
However, when shingles of different sizes are produced, the number of paths in the graph can explode.
This is also the case when `output_unigrams` is set to true.

In 5.3 all paths are generated before building the query, so a node can easily OOM on a single big input query. In 5.4 and beyond we detect the explosion earlier, but we fail the entire request.
Instead we should be able to detect the problematic token filters and disable the graph analysis for these fields.
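
To give a rough sense of how quickly the number of paths grows, here is a small Python sketch. It is a simplified model, not the query parser's actual code: it assumes a shingle of size k spans exactly k positions and that a path is any sequence of shingles covering the input from start to end. The function name `count_paths` and its parameters are made up for illustration.

```python
def count_paths(num_tokens, sizes):
    """Count the ways to cover num_tokens positions with shingles of the given sizes.

    Simplified model (an assumption, not Lucene's implementation): a shingle of
    size k spans k consecutive positions, and a path is a sequence of shingles
    that covers every position exactly once.
    """
    # paths[i] = number of distinct paths that end exactly at position i
    paths = [0] * (num_tokens + 1)
    paths[0] = 1  # one way to cover zero tokens: the empty path
    for pos in range(1, num_tokens + 1):
        paths[pos] = sum(paths[pos - k] for k in sizes if k <= pos)
    return paths[num_tokens]


if __name__ == "__main__":
    for n in (10, 20, 40):
        single = count_paths(n, sizes=(2,))    # one shingle size only
        mixed = count_paths(n, sizes=(1, 2))   # unigrams plus bigrams
        print(n, single, mixed)
    # prints:
    # 10 1 89
    # 20 1 10946
    # 40 1 165580141
```

Under this model a single shingle size leaves one path, while mixing sizes 1 and 2 (for example bigrams plus unigrams) already yields a Fibonacci-like growth, which is consistent with the explosion described above.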