If the tokenized original query differs from the keyword in the synonym file, the query parser doesn't build synonym search phrases. #26

Closed
jhsuh opened this issue Jul 1, 2013 · 1 comment

jhsuh commented Jul 1, 2013

When I use a tokenizer other than KeywordTokenizer or WhitespaceTokenizer, the tokenized (parsed) query sometimes differs from the original query.
If the tokenized query differs from the keyword in the synonym file, this query parser doesn't find the synonyms and doesn't build synonym search phrases.
I think that, for synonym lookup, the query parser should check both the tokenized query and the original query.
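
As a rough illustration of the suggestion above, here is a minimal Python sketch of a synonym lookup that tries both the original query and its tokenized form. It is not the plugin's code: `tokenize`, `normalize`, `build_index`, and `find_synonyms` are hypothetical names, and the regex tokenizer is only a toy approximation of StandardTokenizer plus lowercasing (it splits on non-word characters and emits each kana or Han character as its own token).

```python
import re

# Toy stand-in for the analyzer (assumption: NOT the real StandardTokenizer).
# Each kana/Han character becomes its own token; other word characters are
# grouped into runs, and everything else (spaces, hyphens) is a separator.
_TOKEN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]|[^\W\u3040-\u30ff\u4e00-\u9fff]+")

# Synonym entries taken from the report.
SYNONYMS_TXT = """血と骨,Blood and Bones
wi-fi, wifi"""


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return _TOKEN.findall(text.lower())


def normalize(text):
    """Tokenized form of the text, joined with single spaces."""
    return " ".join(tokenize(text))


def build_index(synonyms_txt):
    """Map both the raw and the tokenized form of every entry to its group."""
    index = {}
    for line in synonyms_txt.splitlines():
        group = [entry.strip() for entry in line.split(",") if entry.strip()]
        for entry in group:
            index[entry.lower()] = group      # original form, e.g. "wi-fi"
            index[normalize(entry)] = group   # tokenized form, e.g. "wi fi"
    return index


def find_synonyms(query, index):
    """Look up the original query first, then its tokenized form."""
    for key in (query.strip().lower(), normalize(query)):
        if key in index:
            # Return the other members of the group, not the query itself.
            return [e for e in index[key] if normalize(e) != normalize(query)]
    return []


INDEX = build_index(SYNONYMS_TXT)
print(find_synonyms("wi-fi", INDEX))
print(find_synonyms("血 と 骨", INDEX))
```

Indexing each entry under both forms means that a query such as `wi-fi` matches on its original text, while `血 と 骨` matches through the tokenized form of the `血と骨` entry.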

# tokenizer : StandardTokenizer
# synonyms.txt
血と骨,Blood and Bones
wi-fi, wifi

wifi ==> OK
http://10.141.15.112:8983/solr/select?qf=Title_t&q=wifi&defType=synonym_edismax&synonyms=true&debugQuery=true&q.op=AND&synonyms.constructPhrases=true&synonyms.originalBoost=1.1&synonyms.synonymBoost=0.9&rows=0
+((Title_t:wifi)^1.1 (((+(Title_t:"wi fi")) (+(Title_t:wifi)))^0.9))

wi-fi ==> can't find synonyms
http://10.141.15.112:8983/solr/select?qf=Title_t&q=wi-fi&defType=synonym_edismax&synonyms=true&debugQuery=true&q.op=AND&synonyms.constructPhrases=true&synonyms.originalBoost=1.1&synonyms.synonymBoost=0.9&rows=0
+(((Title_t:wi Title_t:fi)~2))

Blood and Bones ==> OK
http://10.141.15.112:8983/solr/select?qf=Title_t&q=Blood%20and%20Bones&defType=synonym_edismax&synonyms=true&debugQuery=true&q.op=AND&synonyms.constructPhrases=true&synonyms.originalBoost=1.1&synonyms.synonymBoost=0.9&rows=0
+(((+(Title_t:blood) +(Title_t:bones))^1.1) (((+(Title_t:"blood and bones")) (+(Title_t:"血 と 骨")))^0.9))

血と骨 ==> can't find synonyms
http://10.141.15.112:8983/solr/select?qf=Title_t&q=%E8%A1%80%E3%81%A8%E9%AA%A8&defType=synonym_edismax&synonyms=true&debugQuery=true&q.op=AND&synonyms.constructPhrases=true&synonyms.originalBoost=1.1&synonyms.synonymBoost=0.9&rows=0
+(((Title_t:血 Title_t:と Title_t:骨)~3))

血 と 骨 ==> can't find synonyms
http://10.141.15.112:8983/solr/select?qf=Title_t&q=%E8%A1%80%20%E3%81%A8%20%E9%AA%A8&defType=synonym_edismax&synonyms=true&debugQuery=true&q.op=AND&synonyms.constructPhrases=true&synonyms.originalBoost=1.1&synonyms.synonymBoost=0.9&rows=0
+(((Title_t:血) (Title_t:と) (Title_t:骨))~3)
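
To re-run the cases above against another query string, the request URLs can be rebuilt with a small Python helper. This is only a sketch: `build_url` is a hypothetical name, and the host and parameter values are copied from the requests in this report, so adjust them for your own Solr instance.

```python
from urllib.parse import quote, urlencode

# Host and handler copied from the report; change for your own setup.
BASE_URL = "http://10.141.15.112:8983/solr/select"


def build_url(query):
    """Build the debug request used in the report for a given query string."""
    params = [
        ("qf", "Title_t"),
        ("q", query),
        ("defType", "synonym_edismax"),
        ("synonyms", "true"),
        ("debugQuery", "true"),
        ("q.op", "AND"),
        ("synonyms.constructPhrases", "true"),
        ("synonyms.originalBoost", "1.1"),
        ("synonyms.synonymBoost", "0.9"),
        ("rows", "0"),
    ]
    # quote_via=quote encodes spaces as %20 and non-ASCII text as UTF-8 bytes.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


for q in ["wifi", "wi-fi", "Blood and Bones", "血と骨", "血 と 骨"]:
    print(build_url(q))
```
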
@nolanlawson
Member

It seems like the wifi/wi-fi problem is a duplicate of #32. The 血と骨/blood and bones problem (great movie, by the way! :-)) seems to be a duplicate of #9.
