Phrase suggest direct generator possibly not obeying min_word_len 0.90 #3037
Comments
The parameter is |
@clintongormley While I did have a typo in the term suggest, the phrase suggest example works and demonstrates the issue. Could you reopen? To clarify: the gist shows that the term suggester returns the term "iced", but I believe the candidate generator in the phrase suggester never offers "iced" for consideration because of the word length.
This seems to be a bug in the min_doc_freq smoothing. The good news is that it only happens if your query term has a frequency of 1 and the replacement has a frequency of 1 as well, so in practice it might not be an issue. I will have a fix soon; in the meantime, this should help:
|
Using an automatically detected 'min_doc_freq' when the suggest mode is set to 'always' is counterintuitive. If the mode is 'always', ignore the frequency and set the threshold frequency to 0 so that all possible candidates within the given bounds can be drawn. Closes #3037
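The change described in that commit message can be pictured with a toy model. This is only a sketch under stated assumptions, not the Elasticsearch implementation: the function names are hypothetical, and the non-"always" branch simply uses the query term's own frequency as a stand-in for the automatically detected threshold.

```python
from typing import Dict, List


def threshold_frequency(suggest_mode: str, query_term_freq: int) -> int:
    """Return the frequency a candidate must exceed to be drawn (toy model)."""
    if suggest_mode == "always":
        # The behavior the commit message describes: ignore frequency entirely.
        return 0
    # Assumption: stand-in for the automatically detected threshold.
    return query_term_freq


def draw_candidates(
    suggest_mode: str, query_term_freq: int, candidate_freqs: Dict[str, int]
) -> List[str]:
    """Keep only candidates whose frequency exceeds the threshold."""
    threshold = threshold_frequency(suggest_mode, query_term_freq)
    return [term for term, freq in candidate_freqs.items() if freq > threshold]
```

In this model, a query term with frequency 1 and a candidate "iced" with frequency 1 yields no candidates under the stand-in threshold, while `draw_candidates("always", 1, {"iced": 1})` keeps "iced", which matches the freq = 1 case mentioned earlier in the thread.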
@s1monw Is there any chance that max_term_freq is also not being obeyed with "always"? While this patch fixed the test case, I have a real situation where "ice" appears thousands of times and "iced" several hundred. I see "iced" returned by the term suggester, but the phrase suggester never seems to receive it.
@s1monw I have to set max_term_freq in the term suggest to 0.999 (99.9%) for the term to show up. However, when I do the same in the phrase suggest, it is as though the candidate is never generated.
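For readers unfamiliar with fractional values such as 0.999, here is a minimal sketch of one way a max_term_freq check could be evaluated. It assumes that values below 1 are read as a fraction of all documents and values of 1 or more as an absolute document count; the function name and the document counts are hypothetical, and the real check inside Elasticsearch may differ.

```python
import math


def passes_max_term_freq(doc_freq: int, max_term_freq: float, total_docs: int) -> bool:
    """Return True if a query term is rare enough to be eligible for suggestions."""
    if max_term_freq >= 1:
        # Assumption: values of 1 or more are absolute document counts.
        limit = int(max_term_freq)
    else:
        # Assumption: values below 1 are a fraction of all documents.
        limit = math.ceil(max_term_freq * total_docs)
    return doc_freq <= limit
```

With hypothetical numbers, a term found in 5,000 of 100,000 documents fails a 0.01 limit but passes a 0.999 limit, which is the kind of adjustment the comment above describes.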
@s1monw This did fix the issue I had. It seems to respect max_term_freq now with 0.9.1.
@jtreher I think I did!
I ran into an issue where the phrase suggester does not seem to generate candidates for words shorter than the default minimum length of four, even with min_word_len set to 0, 1, 2, or 3. When I run a term suggest, the term comes back as expected.
Here is a gist reproducing the issue:
https://gist.github.com/jtreher/5577747
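For context, a request of the kind described might look roughly like the sketch below. It is not a copy of the gist: the suggestion name `phrase_check`, the field name `name`, and the query text are assumptions for illustration. Only `min_word_len`, `max_term_freq`, the "always" mode, and the direct generator come from this thread.

```python
import json
from typing import Any, Dict


def build_phrase_suggest(text: str, field: str, min_word_len: int) -> Dict[str, Any]:
    """Build a search body containing a phrase suggester with one direct generator."""
    return {
        "suggest": {
            "phrase_check": {  # hypothetical suggestion name
                "text": text,
                "phrase": {
                    "field": field,
                    "direct_generator": [
                        {
                            "field": field,
                            "suggest_mode": "always",
                            # Lowered from the default of 4 so short words qualify.
                            "min_word_len": min_word_len,
                        }
                    ],
                },
            }
        }
    }


if __name__ == "__main__":
    # "name" and "ice tea" are placeholder values, not taken from the gist.
    print(json.dumps(build_phrase_suggest("ice tea", "name", 1), indent=2))
```

The printed JSON can be sent as the body of a search request to compare the phrase suggester's output with what the term suggester returns for the same text.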