"Did you mean" spellchecking #911

Closed
sindresorhus opened this Issue May 8, 2011 · 53 comments

@sindresorhus

Google's "Did you mean" feature is very useful. Would be awesome if ES could implement this.

Lucene has pulled in the SpellChecker contrib. Maybe ES could expose that?

For example, if I specify suggestSimilar with some optional parameters in my search object, I could get back an array of suggestions.

@keteracel

keteracel May 9, 2011

You can implement this yourself by keeping an index of search terms, probably analyzed with n-grams, and sorting matches by popularity.
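
A minimal sketch of that idea, assuming an in-memory store instead of a real ES index. The class and function names (`SearchTermIndex`, `char_ngrams`), the trigram size, and the overlap threshold are all hypothetical choices for illustration: past search terms are broken into character n-grams, candidates are found through shared n-grams, and the survivors are sorted by how often they were searched.

```python
from collections import Counter, defaultdict


def char_ngrams(term, n=3):
    """Return the set of character n-grams of a term, padded with '$' markers."""
    padded = f"${term.lower()}$"
    return {padded[i:i + n] for i in range(max(1, len(padded) - n + 1))}


class SearchTermIndex:
    """Toy in-memory index of past search terms, keyed by character n-grams."""

    def __init__(self, n=3):
        self.n = n
        self.popularity = Counter()       # term -> number of times it was searched
        self.postings = defaultdict(set)  # n-gram -> terms containing that n-gram

    def record(self, term):
        """Register one search for the given term."""
        term = term.lower()
        self.popularity[term] += 1
        for gram in char_ngrams(term, self.n):
            self.postings[gram].add(term)

    def suggest(self, query, limit=3, min_overlap=0.3):
        """Return up to `limit` similar terms, most popular first."""
        query = query.lower()
        grams = char_ngrams(query, self.n)
        shared = Counter()
        for gram in grams:
            for term in self.postings.get(gram, ()):
                shared[term] += 1
        candidates = []
        for term, count in shared.items():
            if term == query:
                continue
            # Jaccard similarity between the two n-gram sets.
            overlap = count / len(grams | char_ngrams(term, self.n))
            if overlap >= min_overlap:
                candidates.append((-self.popularity[term], -overlap, term))
        candidates.sort()
        return [term for _, _, term in candidates[:limit]]


index = SearchTermIndex()
for term, times in [("elasticsearch", 5), ("elastic", 2), ("lucene", 3)]:
    for _ in range(times):
        index.record(term)
print(index.suggest("elasticsaerch"))
```

In a real deployment the postings would live in a separate ES index with an n-gram analyzer, and the popularity counter would be fed from query logs.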

@sindresorhus

sindresorhus May 16, 2011

Can you give an example?

@keteracel

keteracel May 16, 2011

But I also see that Lucene has pulled in the SpellChecker contrib: http://lucene.apache.org/java/3_1_0/api/all/org/apache/lucene/search/spell/SpellChecker.html so I guess ES could expose that.

@sindresorhus

sindresorhus May 17, 2011

@keteracel Read the article you linked. Looks interesting, but is probably more than I can handle at the moment. I really think something as useful as this should be in ES by default. I've updated the issue with a better description.

@kimchy

kimchy May 18, 2011

Member

The current spell checker requires building an auxiliary index in order to support it (and moreover, requires reindexing the data periodically). In Lucene 4.0, since fuzzy queries are much faster, spell checking can be done on the main index. So the logic is that it makes little sense to incorporate a feature that is currently quite heavyweight, rather than simply waiting and implementing it easily once 4.0 is out.
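
To illustrate what spell checking "on the main index" means, here is a rough sketch that assumes the index's term dictionary is available as a plain mapping from term to document frequency (a hypothetical stand-in, as are the names `edit_distance` and `did_you_mean`). It scans every term by brute force, whereas Lucene 4.0's fuzzy matching relies on Levenshtein automata, so this shows the idea only and not Lucene's implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with the classic two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def did_you_mean(word, term_doc_freq, max_edits=2):
    """Return the closest indexed term (ties broken by frequency), or None."""
    if word in term_doc_freq:
        return None  # the word already occurs in the index, nothing to correct
    best = None
    for term, freq in term_doc_freq.items():
        distance = edit_distance(word, term)
        if distance <= max_edits:
            key = (distance, -freq, term)
            if best is None or key < best:
                best = key
    return best[2] if best else None


# Hypothetical term dictionary taken straight from the main index.
terms = {"search": 120, "seared": 5, "lucene": 40}
print(did_you_mean("serch", terms))
```

No auxiliary index is built here: the candidates come directly from the terms that are already indexed, which is why nothing needs to be rebuilt when the data changes.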

@sindresorhus

sindresorhus May 18, 2011

Agreed, that's the best solution. Any idea when 4.0 will be out?

@kimchy

kimchy May 18, 2011

Member

No, no due date yet. It seems like the pace is being picked up towards a release, but it will take a few months I think.

@sindresorhus

sindresorhus May 19, 2011

Ok, thanks ;) Looking forward to it.

@richardsyeo

richardsyeo Jul 20, 2011

We would very much like this feature too.

@naquad

naquad Jul 27, 2011

Hi.

Is there any news on this? Tired of running around with ASpell :(

@j

j Sep 5, 2011

+1

@beau-mind

beau-mind Sep 6, 2011

We would like to use spellchecker too. Thank you.

@conradchu

conradchu Oct 17, 2011

+1

@mhluongo

mhluongo Oct 21, 2011

+1

@tfreitas

tfreitas Oct 25, 2011

+1

@fbecart

fbecart Nov 1, 2011

+1

@alexis779

alexis779 Nov 2, 2011

+1

@tfreitas

tfreitas Nov 4, 2011

+1

@dstendardi

dstendardi Nov 16, 2011

+1

@adamw

adamw Dec 14, 2011

+1

@juliuss

juliuss Dec 14, 2011

+1

@bryangreen

bryangreen Dec 20, 2011

+1

@abecciu

abecciu Dec 20, 2011

+1

@nickdunn

nickdunn Jan 6, 2012

Apologies for the +1, but this is way up my wishlist too.

@ream88

ream88 Jan 8, 2012

Yep, me too! +1

@sebastianseilund

sebastianseilund Feb 5, 2012

+1
This would be an awesome feature, for an already awesome product! Thank you so much :)

@ghost

ghost Mar 9, 2012

+1

@plentz

plentz Mar 24, 2012

+1

@gmatthew

gmatthew Mar 26, 2012

+1

@krmcbride

krmcbride Apr 11, 2012

+1

@j

j Apr 13, 2012

ping @kimchy It's been almost a year! :) Any status on this? Tonnnnns of +1's up in here!

@mhluongo

mhluongo Apr 13, 2012

Guys, I think @kimchy gets it... we all want this. However, Lucene 4.0 hasn't been released yet, and his last update mentioned that that release would make this feature much easier. Maybe we should be pressuring the Lucene team to hurry up? There's been talk of a 4.0 release forever.

@j

j Apr 15, 2012

@mhluongo, it is understood that this is better done as a "Lucene 4.0" feature, but there seem to be other options for spell checking, for example #646. A lot of open source projects don't wait over a year for a feature that the community wants. A bridge could be built for searching now, and when Lucene supports it directly, it can stay backward compatible with the temporary/secondary solution (e.g. hunspell). For example, the Symfony2 PHP framework builds functionality for PHP4.0 to get the minor optimization, but has a backup strategy for PHP versions of 3.x.

My two cents is that this is a huge feature in memory-based searching... and would definitely set elasticsearch apart from anything else out there right now.

Just my two cents IMO. :)

@mhluongo

mhluongo Apr 16, 2012

@jstout24 I know that waiting for Lucene 4 is just the path of least resistance, but there are a ton of other awesome features that we could use, as well, and that could be written/maintained in the time saved. At some point one of these +1's needs to start coding themselves if we want this feature, or be okay with waiting (I'm guilty of this too, obviously).

Just trying to be understanding of an embattled OSS developer :)

@bradbeattie

bradbeattie Apr 18, 2012

To people "+1"ing, take a look over here: https://issues.apache.org/jira/browse/LUCENE/fixforversion/12314025. That's the progress of Lucene 4.0.

@kimchy

kimchy Apr 22, 2012

Member

Heya fellows, understood, this feature is highly important. The only thing that can be done currently (aside from other ways of solving it, like using a custom-built index using ngrams and the like) is to possibly write a plugin (and probably new extension points) on top of the current Lucene spell checking behavior. But it's not really good... (as I explained in my first comment here).

@DeeJayPee

DeeJayPee Aug 17, 2012

Hello,

Sorry, but I have to +1 this issue too ^^
But now that Lucene 4.0 is out, is it possible in any way, or do we need an implementation in ES?

Regards,

@brusic

brusic Aug 17, 2012

Contributor

Lucene 4.0 is not out, only the beta. Final release probably will not happen until October.

@louman

louman Sep 9, 2012

+1

@elfassy

elfassy Sep 14, 2012

+1

@Fibonacci-Solucoes-Ageis

Fibonacci-Solucoes-Ageis Sep 23, 2012

+1

@maik2102

maik2102 Sep 24, 2012

+1

@tfreitas

tfreitas Oct 8, 2012

+1

@schmurfy

schmurfy Oct 9, 2012

I think we all know by now that many people are interested in this feature, so can we please stop with the +1s?
They serve little to no purpose and spam anyone who is watching this thread for real information.

@kul

kul Oct 13, 2012

Contributor

4.0 is Out! :)

@herlambang

herlambang Oct 17, 2012

+1

@brusic

brusic Oct 17, 2012

Contributor

I agree with schmurfy, enough with +1s. If you want to subscribe to this issue, you can change your notification settings below. Look for the dropdown that says "Not watching thread" and change it to "Watch".

Shay commented on spellchecking and Lucene 4.0 last week. In case you missed it, here is the thread:
https://groups.google.com/d/topic/elasticsearch/p2mu0Tv3VPI/discussion

"The plan is to first get Lucene 4.0 integrated with elasticsearch, and then expose all the new features. We will take it feature by feature, but to your points, there will be a spellcheck builtin using the new "direct" spellcheck feature, you will be able to configure codecs in the mapping, and write a plugin that introduces new codecs, and so on..."

@tfreitas

tfreitas Jan 29, 2013

+1

@brunobowden

brunobowden Feb 2, 2013

+1
I'd particularly like to use it when it's deployed on StackOverflow

@schmurfy

schmurfy Feb 6, 2013

Seriously, can we stop with the +1s? There is a watch thread button at the bottom of the page if you want to be notified of any changes here, and using it won't send a notification to everyone watching this.

@clintongormley

clintongormley Apr 4, 2013

Member

See the suggest API added in 0.90
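
For readers landing here, a small sketch of what a term suggester request might look like, built as a plain Python dictionary rather than sent to a server. The request and response shapes follow my understanding of the suggest API documentation; the suggestion name `did_you_mean`, the field name `body`, the helper names, and the sample response are hypothetical and hand-written for illustration.

```python
import json


def build_term_suggest_body(text, field, name="did_you_mean"):
    """Build a search request body asking the term suggester for corrections."""
    return {
        "suggest": {
            name: {
                "text": text,
                "term": {"field": field},
            }
        }
    }


def best_corrections(response, name="did_you_mean"):
    """Map each input token to its first suggested option, skipping tokens with none."""
    corrections = {}
    for entry in response.get("suggest", {}).get(name, []):
        options = entry.get("options", [])
        if options:
            corrections[entry["text"]] = options[0]["text"]
    return corrections


# Hand-written sample response for illustration only, not real server output.
sample_response = {
    "suggest": {
        "did_you_mean": [
            {"text": "elasticsaerch", "offset": 0, "length": 13,
             "options": [{"text": "elasticsearch", "score": 0.85, "freq": 42}]},
            {"text": "lucene", "offset": 14, "length": 6, "options": []},
        ]
    }
}

print(json.dumps(build_term_suggest_body("elasticsaerch lucene", "body"), indent=2))
print(best_corrections(sample_response))
```

The printed body is what would be posted to a search endpoint. Tokens that come back with an empty options list are treated as already correct.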

@ymiao

ymiao Nov 23, 2016

Does the phrase suggester support Chinese?
