Note: This is an experimental feature!
Traditionally, FST suggesters had to build an in-memory structure upfront and keep it in sync as data was inserted or deleted. Building that FST can be expensive and slow on production systems.
So why not build an efficient FST-like structure at index time, load it quickly into memory, and use it for suggestions?
Before diving into implementation details, let's start with a small sample.
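A minimal sketch of what the request side of such a sample might look like, written as a Python dict that serializes to the JSON body. The suggestion label `song-suggest`, the field name `suggest`, and the `completion` key are assumptions for illustration; the only thing taken from this description is that the request follows the same shape as the term and phrase suggesters.

```python
import json


def build_suggest_request(prefix, field="suggest", name="song-suggest"):
    """Build a suggest request body for a prefix.

    `name` and `field` are hypothetical placeholders. The "completion" key
    is assumed to select the suggester type, in the same position where the
    term and phrase suggesters put their own key.
    """
    return {name: {"text": prefix, "completion": {"field": field}}}


if __name__ == "__main__":
    # Print the JSON body that would be sent for the prefix "n".
    print(json.dumps(build_suggest_request("n"), indent=2))
```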
In the response, the text returned for each suggestion is the output provided during indexing. The payload is also included; it might carry a reference ID to the artist, which makes it easy to retrieve further information.
Mapping options
To support prefix suggestions, the field has to be mapped with type completion.
While the type field is mandatory, the index_analyzer and search_analyzer fields can be omitted. The simple analyzer is used by default.
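As a hedged sketch, the field definition could be assembled like this. The helper name `completion_field` is hypothetical; only `type`, `index_analyzer`, and `search_analyzer` come from the text above, and omitted analyzers are simply left out so the default applies.

```python
import json


def completion_field(index_analyzer=None, search_analyzer=None):
    """Return a field definition for a completion field.

    Only "type" is mandatory. The analyzer settings are added when given;
    leaving them out relies on the default analyzer described in the text.
    """
    definition = {"type": "completion"}
    if index_analyzer is not None:
        definition["index_analyzer"] = index_analyzer
    if search_analyzer is not None:
        definition["search_analyzer"] = search_analyzer
    return definition


if __name__ == "__main__":
    # Minimal definition, then one with both analyzers spelled out.
    print(json.dumps(completion_field()))
    print(json.dumps(completion_field("simple", "simple")))
```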
Payloads
If you want to return payloads, you have to explicitly enable them with payloads: true. A payload can contain arbitrary JSON but must be a JSON object, with an opening { and a closing }; plain strings or arrays are not allowed.
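The object-only rule can be checked on the client before indexing. The following is a sketch of such a check, not part of the feature itself; `check_payload` and the `artistId` key are hypothetical names.

```python
import json


def check_payload(payload):
    """Accept a payload only if it is a JSON object (a dict in Python).

    Strings, lists, and numbers are rejected, mirroring the rule that a
    payload must start with { and end with }.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object, got %s" % type(payload).__name__)
    # json.dumps raises TypeError if a nested value is not JSON-serializable.
    json.dumps(payload)
    return payload


if __name__ == "__main__":
    print(check_payload({"artistId": 2321}))
    for bad in ("just a string", ["a", "list"]):
        try:
            check_payload(bad)
        except ValueError as err:
            print("rejected:", err)
```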
Preserve separators
In addition, you can set preserve_separators: false if you want to return "Foo Fighters" when searching for "foof" (using a suitable analyzer, of course).
Preserve position increments
You can set preserve_position_increments: false to ignore position increments. This is needed if the first word is a stopword and your analyzer filters out stopwords; it lets you suggest for b and get back The Beatles.
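To make the effect of the two preserve options concrete, here is a toy model in Python. It is an illustration only and does not reproduce how the suggester or the underlying FST actually encodes tokens: the separator and hole markers, the stopword list, and all function names are assumptions chosen for the example.

```python
import re

STOPWORDS = {"the", "a", "an"}  # tiny illustrative list, not a real analyzer's
SEP = "\x1f"   # arbitrary marker standing in for a token separator
HOLE = "\x00"  # arbitrary marker standing in for a removed token's position


def analyze(text, remove_stopwords=False):
    """Lowercase and split on non-letters; removed stopwords become None."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [None if remove_stopwords and t in STOPWORDS else t for t in tokens]


def to_key(tokens, preserve_separators=True, preserve_position_increments=True):
    """Flatten analyzed tokens into the string used for prefix lookup."""
    parts = []
    for token in tokens:
        if token is None:
            # A removed stopword leaves a hole only if increments are preserved.
            if preserve_position_increments:
                parts.append(HOLE)
            continue
        parts.append(token)
    return (SEP if preserve_separators else "").join(parts)


def matches(query, candidate, remove_stopwords=False,
            preserve_separators=True, preserve_position_increments=True):
    """True if the analyzed query is a prefix of the analyzed candidate."""
    opts = (preserve_separators, preserve_position_increments)
    key = to_key(analyze(candidate, remove_stopwords), *opts)
    return key.startswith(to_key(analyze(query, remove_stopwords), *opts))


if __name__ == "__main__":
    print(matches("foof", "Foo Fighters"))
    print(matches("foof", "Foo Fighters", preserve_separators=False))
    print(matches("b", "The Beatles", remove_stopwords=True))
    print(matches("b", "The Beatles", remove_stopwords=True,
                  preserve_position_increments=False))
```

In this model, "foof" only matches "Foo Fighters" once separators are dropped, and "b" only matches "The Beatles" once the hole left by the removed stopword is ignored.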
Indexing
Simple case
The simplest case to index looks like this:
"suggestField" : [ "The Prodigy Firestarter", "Firestarter"]
Depending on the analyzer used
Outputs
If you define an output, that output is always returned for a matching suggestion:
"suggestField" : {
"input" : [ "The Prodigy Firestarter", "Firestarter"],
"output" : "The Prodigy, Firestarter"
}
Weights
You should define custom weights instead of relying on the default one (see the drawbacks section). The weight must be a positive integer (no floats) and determines the order of your suggestions.
"suggestField" : {
"input" : [ "The Prodigy Firestarter", "Firestarter"],
"output" : "The Prodigy, Firestarter",
"weight" : 42
}
Custom weights also make your suggestions more valuable: for example, you could rank the most played song or the best rated hotel first.
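As a sketch of how such weights might be attached on the client side, the helper below builds the value of a suggest field from inputs, an output, and a weight, and rejects weights that are not positive integers. The helper name `suggestion_entry` is hypothetical, and using a play count as the weight is only one possible choice.

```python
import json


def suggestion_entry(inputs, output, weight):
    """Build the value of a suggest field with an explicit weight.

    The weight must be a positive integer; floats and booleans are rejected.
    """
    if isinstance(weight, bool) or not isinstance(weight, int) or weight <= 0:
        raise ValueError("weight must be a positive integer, got %r" % (weight,))
    return {"input": list(inputs), "output": output, "weight": weight}


if __name__ == "__main__":
    # Hypothetical play count used directly as the weight.
    play_count = 42
    entry = suggestion_entry(
        ["The Prodigy Firestarter", "Firestarter"],
        "The Prodigy, Firestarter",
        play_count,
    )
    print(json.dumps({"suggestField": entry}, indent=2))
```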
Search
Searches work exactly like the phrase and term suggesters.
Drawbacks
Using term frequency as default weight
If you do not specify a weight, the term frequency is used. This only makes sense if you optimize to a single segment or have large segments. Otherwise, term frequency is not a reliable weight indicator, and setting custom weights yourself is more likely to yield the results you expect.
mute pushed a commit to mute/elasticsearch that referenced this issue on Jul 29, 2015:
This commit introduces near realtime suggestions. For more information about
its usage refer to github issue elastic#3376
From the implementation point of view, a custom AnalyzingSuggester is used
in combination with a custom postingsformat (which is not exposed to the user
anywhere for him to use).
Closes elastic#3376