[Feature request] Extra phrase_suggestor filters #41862
Labels: >enhancement, :Search/Suggesters ("Did you mean" and suggestions as you type), Team:Search (Meta label for search team)
Describe the feature: It would be great to have a way to filter the suggestions returned by the phrase suggester.
While working on the "Did you mean" functionality of our product search page, we noticed that Elasticsearch returns some suggestions that are not useful to our users. We understand why they are returned, and they all fit the given settings, but from a user's point of view they are not helpful.
Requests:
1. Filter out suggestions if both input and suggestion are fully numeric.
So the input "301" should not return "300" just because it's more popular.
Note: The input "brand 999" should still return the suggestion "brand 99", if the latter exists and the input doesn't.
2. Filter out suggestions if a number is split into two numbers, or into a number + word
The input "301" now also returns the suggestion "30 m"
The input "30100" now returns the suggestion "3 100"
3. Filter out suggestions if they simply add a space between a word and a number
Take the document "word 123" and the search query "word123". We already have the word_delimiter filter set up, so the query matches the document, but the suggester still returns "word 123".
On the other hand, with the document "visitor card", the input "visitorcard" should still return "visitor card" as a suggestion.
4. Filter out suggestions if the stemmed suggestion and the stemmed query are equal
With the input "batteries", users get "battery" as a suggestion. If the input and the suggestion are equal after stemming, the suggestion doesn't help, because we already apply the stemming filter to the document search query.
5. Filter out suggestions if the asciifolded suggestion and the asciifolded query are equal
With the input "facade", users get "façade" as a suggestion. We already use the asciifolding filter to make the document search work, so this suggestion doesn't help.
Some of these filters can be implemented on the application side, but others, stemming for example, are not easy to reproduce there. Ideally each filter would be optional and enabled through flags/settings in the suggestion query.
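For illustration, here is a rough sketch of how the five rules could be approximated on the application side in Python. Everything in it is an assumption made for the example rather than existing Elasticsearch behaviour: the function names are hypothetical, `ascii_fold` only approximates the asciifolding filter by stripping combining marks, rule 2 is interpreted broadly (a numeric query with any multi-token suggestion), and rule 4 only runs if the caller supplies a `stem` function that matches the analyzer used at index time.

```python
import re
import unicodedata
from typing import Callable, Iterable, List, Optional

_NUMERIC = re.compile(r"[0-9]+")
# Zero-width positions between a letter and a digit (in either order).
_LETTER_DIGIT_BOUNDARY = re.compile(r"(?<=[^\W\d_])(?=\d)|(?<=\d)(?=[^\W\d_])")


def ascii_fold(text: str) -> str:
    """Approximate asciifolding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_letter_digit(text: str) -> str:
    """Insert a space at letter/digit boundaries and normalise whitespace."""
    return " ".join(_LETTER_DIGIT_BOUNDARY.sub(" ", text).split())


def keep_suggestion(
    query: str,
    suggestion: str,
    stem: Optional[Callable[[str], str]] = None,  # hypothetical stemmer hook
) -> bool:
    """Return False if the suggestion falls under one of the five rules."""
    q, s = query.strip(), suggestion.strip()
    if _NUMERIC.fullmatch(q):
        # Rule 1: numeric query, numeric suggestion.
        # Rule 2: numeric query split into several tokens.
        if _NUMERIC.fullmatch(s) or len(s.split()) > 1:
            return False
    # Rule 3: the only difference is a space between a word and a number.
    if split_letter_digit(q) == split_letter_digit(s):
        return False
    # Rule 5: equal after (approximate) ASCII folding.
    if ascii_fold(q) == ascii_fold(s):
        return False
    # Rule 4: equal after stemming, only if a stemmer is supplied.
    if stem is not None:
        if [stem(t) for t in q.split()] == [stem(t) for t in s.split()]:
            return False
    return True


def filter_suggestions(
    query: str,
    suggestions: Iterable[str],
    stem: Optional[Callable[[str], str]] = None,
) -> List[str]:
    """Keep only the suggestions that pass every rule."""
    return [s for s in suggestions if keep_suggestion(query, s, stem)]
```

The stemming rule is the weak point of this approach: unless the application uses the same stemmer as the index analyzer, the comparison can disagree with what the search query actually matches, which is why server-side support would be preferable.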
Our suggestion fields
suggestionText: text field, no analyzers
Subfields (all type text):
.trigram, shingle filter.
.reverse, reverse filter.
.reverse_lower, reverse, lowercase, shingle filters.
.lowercase, lowercase filter.
The shingle filter is set up with min-size=2 and max-size=3.
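To make the field setup above easier to reproduce, the following sketch expresses it as an index creation body built in Python. It is a guess at the configuration, not a copy of it: the analyzer and filter names (`trigram`, `reverse`, `reverse_lower`, `lowercase`, `shingle_2_3`) and the `standard` tokenizer are assumptions, and the sizes are mapped onto the shingle filter's `min_shingle_size` and `max_shingle_size` parameters.

```python
import json


def custom_analyzer(*filters: str) -> dict:
    """Custom analyzer on the standard tokenizer (tokenizer is an assumption)."""
    return {"type": "custom", "tokenizer": "standard", "filter": list(filters)}


index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Shingles of 2 to 3 tokens, as described above.
                "shingle_2_3": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 3,
                }
            },
            "analyzer": {
                "trigram": custom_analyzer("shingle_2_3"),
                "reverse": custom_analyzer("reverse"),
                "reverse_lower": custom_analyzer(
                    "reverse", "lowercase", "shingle_2_3"
                ),
                "lowercase": custom_analyzer("lowercase"),
            },
        }
    },
    "mappings": {
        "properties": {
            "suggestionText": {
                "type": "text",
                # One text subfield per analyzer, named after it.
                "fields": {
                    name: {"type": "text", "analyzer": name}
                    for name in ("trigram", "reverse", "reverse_lower", "lowercase")
                },
            }
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
```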