Search with UTF-8 character #167

Closed
davidsf opened this issue Jan 13, 2012 · 7 comments

@davidsf

davidsf commented Jan 13, 2012

Hi, with Sunspot 1.3.0 and rsolr 1.0.6, searches containing UTF-8 characters return 0 results.

For example, when I search for 'japón', the Solr logs show the query (q=japón) with the 'ó' mis-encoded.

No encoding errors appear anywhere, though.

I have seen that other UTF-8 issues have already been resolved, so maybe this is a different one.

@davidsf
Author

davidsf commented Jan 16, 2012

It seems the problem is somewhere in Sunspot. With rsolr alone it works:

ruby-1.9.2-p180 :001 > require 'rsolr'
 => true
ruby-1.9.2-p180 :002 > RSolr.version
 => "1.0.6"
ruby-1.9.2-p180 :003 > solr = RSolr.connect :url => "http://<url>"
ruby-1.9.2-p180 :004 > s = solr.get('select', :params => {:q => 'japón' })

In the Solr logs the search appears correctly:

INFO: [core0] webapp=/solr path=/select params={wt=ruby&q=japón} hits=2023 status=0 QTime=7

Running the same query through Sunspot (git master), the Solr logs show q=japón with the 'ó' mis-encoded...

@brutuscat
Contributor

I'm running Ruby 1.9.3 with the latest Sunspot from master and I'm NOT seeing this behavior. For example, one of my queries looks like this:

q=%C2%BFC%C3%B3mo+lo+hace%5C%3F

Which is: q=¿Cómo lo hace?

And seems to work just fine.

It may be that your string.encoding is not the correct one...
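
To make the percent-encoded query above easier to read, here is a small sketch using only Ruby's standard library. The helper name `form_encode` is made up for illustration and is not part of Sunspot or rsolr. The `%5C` in the logged query is a backslash, presumably an escape placed before the `?`.

```ruby
# encoding: utf-8
require 'uri'

# Hypothetical helper: percent-encode one form value as UTF-8 bytes.
def form_encode(value)
  URI.encode_www_form_component(value)
end

# Single quotes keep the backslash literal, so this is: ¿Cómo lo hace\?
puts form_encode('¿Cómo lo hace\?')   # %C2%BFC%C3%B3mo+lo+hace%5C%3F

# Decoding the logged value gives the original text back.
puts URI.decode_www_form_component('%C2%BFC%C3%B3mo+lo+hace%5C%3F')
```

Each non-ASCII character becomes two escaped bytes here (`ó` is `%C3%B3`), which is what a correctly UTF-8 encoded query should look like on the wire.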

@davidsf
Author

davidsf commented Jan 17, 2012

In the Rails console the string.encoding seems OK:

    a="a"
    a.encoding
     => #<Encoding:UTF-8>

    Product.search do
         keywords(a)
    end

The code above produces 0 hits and a wrongly encoded query in the Solr logs.
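
A side note on the check above: `String#encoding` only reports the label attached to a string, so it can help to look at the bytes as well. The sketch below is plain Ruby, and `hex_bytes` is a hypothetical helper name.

```ruby
# encoding: utf-8

# Hypothetical helper: show the raw bytes of a string as uppercase hex.
def hex_bytes(str)
  str.bytes.map { |b| format('%02X', b) }.join(' ')
end

s = 'japón'
puts s.encoding.name      # the label Ruby attaches to the string
puts s.valid_encoding?    # whether the bytes are well-formed for that label
puts hex_bytes(s)         # 6A 61 70 C3 B3 6E when the string is UTF-8
```

If the bytes for `ó` are `C3 B3` and `valid_encoding?` is true, the string is fine on the Ruby side, which would suggest the problem happens later, when the request is sent or decoded.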

@wulfman

wulfman commented Jan 17, 2012

I have the same problem with German umlauts (since I changed to 1.3.0 and rsolr 1.0.6).

Searching for "Büro" from Rails shows up mis-encoded in the log file, while the same search from the Solr admin console appears correctly as "q=Büro".

I added URIEncoding="UTF-8" to the server.xml configuration. That helps with the older Sunspot 1.2.1 and rsolr 0.12.1, but it does not solve the problem with the newer Sunspot version.
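
One plausible way such garbling arises, offered as an assumption rather than something this thread confirms: the server reads the UTF-8 bytes as ISO-8859-1 because no charset was declared. The sketch below reproduces that misreading in plain Ruby. `misread_as_latin1` is a hypothetical name.

```ruby
# encoding: utf-8

# Hypothetical helper: relabel UTF-8 bytes as ISO-8859-1 (the assumed server
# default), then transcode back to UTF-8 so the result can be printed.
def misread_as_latin1(str)
  str.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

puts misread_as_latin1('Büro')    # BÃ¼ro
puts misread_as_latin1('japón')   # japÃ³n
```

A term garbled like this will not match the indexed text, which would be consistent with the 0 hits reported above.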

@wulfman

wulfman commented Jan 17, 2012

I guess the problem is line 193 in rsolr/client.rb.

Replacing
opts[:headers]['Content-Type'] ||= 'application/x-www-form-urlencoded'
with
opts[:headers]['Content-Type'] ||= 'application/x-www-form-urlencoded; charset=utf-8'
solves the problem (temporarily).

But how can the headers be configured from the app?
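
The thread does not say how rsolr or Sunspot expose headers to the app, so the sketch below does not use either library. It only illustrates, with Ruby's standard `net/http`, what a form POST carrying the patched Content-Type looks like. The helper name `build_search_request` and the path `/solr/select` are placeholders, and the request is built but never sent.

```ruby
# encoding: utf-8
require 'net/http'
require 'uri'

# Hypothetical helper: build (without sending) a form POST that declares
# its body as UTF-8, mirroring the patched header value above.
def build_search_request(path, query)
  req = Net::HTTP::Post.new(path)
  req['Content-Type'] = 'application/x-www-form-urlencoded; charset=utf-8'
  req.body = URI.encode_www_form({ 'q' => query })
  req
end

req = build_search_request('/solr/select', 'japón')
puts req['Content-Type']   # application/x-www-form-urlencoded; charset=utf-8
puts req.body              # q=jap%C3%B3n
```

With the charset stated explicitly, the receiving side no longer has to guess how to decode the `%C3%B3` bytes.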

@wulfman

wulfman commented Jan 17, 2012

I guess this has already been merged into rsolr but not released yet. So you have to wait or build it yourself ;)

@davidsf
Author

davidsf commented Jan 17, 2012

@wulfman The change works for me. Thanks a lot!

Until the next rsolr release, this line in the Gemfile works:

gem 'rsolr', :git => "https://github.com/mwmitchell/rsolr.git"

Closing the issue because it is an rsolr bug.

davidsf closed this as completed Jan 17, 2012