Add fuzzy feature to common terms query #3502

Closed
mathieu007 opened this Issue Aug 13, 2013 · 3 comments


@mathieu007

Hi guys,

I have a situation where I'd like to remove common words based on a cutoff frequency and apply a minimum_should_match to the low-frequency terms.

The common terms query is just perfect for this job!
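For reference, a request of this kind might look like the minimal sketch below. It builds the query body as a plain Python dict. The helper name `build_common_terms_query`, the field name `body`, the search text, and the numeric values are all placeholders chosen for illustration, not taken from this issue.

```python
import json


def build_common_terms_query(field, text, cutoff_frequency=0.001, low_freq_min_match=2):
    """Return a common terms query body as a plain dict (hypothetical helper)."""
    return {
        "query": {
            "common": {
                field: {
                    "query": text,
                    # Terms more frequent than this cutoff are treated as "common".
                    "cutoff_frequency": cutoff_frequency,
                    # Applied only to the low-frequency terms of the query.
                    "minimum_should_match": {"low_freq": low_freq_min_match},
                }
            }
        }
    }


if __name__ == "__main__":
    # Field name and text are illustrative assumptions.
    print(json.dumps(build_common_terms_query("body", "the brand and the band"), indent=2))
```

The dict can be serialized to JSON and sent as a search request body by whatever client is in use.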

But I would also like to have "fuzziness" on low-frequency terms, and only on low-frequency terms.

Using the query string query doesn't give the expected results, because with a fuzziness of 0.6 words like "and" would match words like "brand", "band", etc.
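To make the concern concrete, here is a small sketch that computes the plain Levenshtein edit distance between "and" and the words mentioned above. It is only an illustration of how close these short words are to each other; how Elasticsearch maps a fuzziness value such as 0.6 onto an allowed edit distance is not modeled here, and the function name `levenshtein` is my own.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for word in ("band", "brand"):
        print(word, levenshtein("and", word))
    # band 1
    # brand 2
```

Because "and" is only three characters long, even one or two edits reach many unrelated words, which is why fuzziness on very common short terms tends to be noisy.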

I think adding a fuzzy feature to the common terms query would do the job.

What do you think?

Thank you

@s1monw s1monw was assigned Aug 13, 2013
@s1monw
Contributor
s1monw commented Aug 13, 2013

This is an interesting idea. I will need to look more closely at how to make this work, but it could work. I'm not sure I'll get to it this week, but I won't forget about it.

@mathieu007

Thanks @s1monw,

I would just like to thank you guys for the wonderful job you did. I am new to Elasticsearch and still learning, but the query style is very intuitive, and once you get used to it there is almost nothing you can't do.

No more SQL, or at least much less...

@s1monw s1monw added the adoptme label Jul 4, 2014
@clintongormley
Member

Common terms is intended to make queries faster. Fuzzy ends up adding many extra terms, thus slowing down the common terms query. On top of that, fuzzy may produce high and low frequency terms, yet the low frequency terms are less likely to be the ones that are correct.

We have decided against supporting fuzzy with common terms.
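The point about fuzzy variants landing in both frequency groups can be sketched with a toy example. The code below is a simplified illustration, not the actual implementation: the document frequencies are made up, the function name `split_by_cutoff` is hypothetical, and the rule used (a cutoff below 1 is a fraction of all documents, otherwise an absolute count) is an assumption about how the cutoff is interpreted.

```python
def split_by_cutoff(terms, doc_freq, total_docs, cutoff_frequency):
    """Split terms into (low_frequency, high_frequency) lists (toy model)."""
    # Assumption: values below 1 are relative to the number of documents,
    # values of 1 or more are absolute document counts.
    if cutoff_frequency < 1:
        threshold = cutoff_frequency * total_docs
    else:
        threshold = cutoff_frequency
    low, high = [], []
    for term in terms:
        if doc_freq.get(term, 0) > threshold:
            high.append(term)
        else:
            low.append(term)
    return low, high


if __name__ == "__main__":
    # Made-up document frequencies for a made-up index of 1000 documents.
    doc_freq = {"and": 900, "brand": 40, "band": 35}
    # Pretend a fuzzy expansion of "band" produced these candidates.
    expanded = ["band", "and", "brand"]
    low, high = split_by_cutoff(expanded, doc_freq, total_docs=1000, cutoff_frequency=0.1)
    print("low:", low)    # low: ['band', 'brand']
    print("high:", high)  # high: ['and']
```

In this toy run a single low-frequency term expands into candidates that fall on both sides of the cutoff, and each extra candidate is one more term the query has to evaluate.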
