public
Description: A Ruby client for Apache Solr
Homepage:
Clone URL: git://github.com/mwmitchell/rsolr.git
rsolr /
name age message
file .gitignore Thu Nov 12 18:06:58 -0800 2009 ignoring "coverage"; fixed *.template ignore [mwmitchell]
file CHANGES.txt Tue Nov 17 18:53:57 -0800 2009 updated CHANGES [mwmitchell]
file LICENSE Thu Sep 03 12:53:02 -0700 2009 updated license to reflect the removal of Mash ... [mwmitchell]
file README.rdoc Tue Nov 17 18:10:58 -0800 2009 removed pagination, yep you heard me [mwmitchell]
file Rakefile Mon Oct 26 18:01:51 -0700 2009 made spec:api the default task [mwmitchell]
file TODO.txt Fri Nov 13 10:38:32 -0800 2009 updated TODO [mwmitchell]
directory examples/ Mon Nov 16 20:01:06 -0800 2009 added exception handling for page/per-page valu... [mwmitchell]
directory lib/ Tue Nov 17 18:10:58 -0800 2009 removed pagination, yep you heard me [mwmitchell]
file rsolr.gemspec Tue Nov 17 18:10:58 -0800 2009 removed pagination, yep you heard me [mwmitchell]
directory solr/ Thu Nov 12 17:59:16 -0800 2009 added solr distribution; ignoring .template files [mwmitchell]
directory spec/ Tue Nov 17 18:10:58 -0800 2009 removed pagination, yep you heard me [mwmitchell]
directory tasks/ Tue Nov 17 05:54:06 -0800 2009 added pagination specs, removed integration spe... [mwmitchell]
README.rdoc

RSolr

A Ruby client for Apache Solr. RSolr has been developed to be simple and extendable. It features transparent JRuby DirectSolrConnection support and a simple Hash-in, Hash-out architecture.

Installation:

  gem sources -a http://gemcutter.org
  sudo gem install rsolr

Related Resources & Projects

Simple usage:

  require 'rubygems'
  require 'rsolr'
  solr = RSolr.connect :url=>'http://solrserver.com'

  # send a request to /select
  response = rsolr.select :q=>'*:*'

  # send a request to a custom request handler; /catalog
  response = rsolr.request '/catalog', :q=>'*:*'

  # alternative to above:
  response = rsolr.catalog :q=>'*:*'

To use a DirectSolrConnection (no http) in JRuby:

  # "apache-solr" should be a path to a solr build.
  Dir['apache-solr/dist/*.jar'].each{|jar|require jar}
  Dir['apache-solr/lib/*.jar'].each{|jar|require jar}

  opts = {:home_dir=>'/path/to/solr/home'}

  # note: you'll have to close the direct connection yourself unless using a block.
  solr = RSolr.direct_connect(opts)
  solr.select :q=>'*:*'
  solr.connection.close

  # OR using a block for automatic connection closing:
  RSolr.direct_connect opts do |solr|
    solr.select :q=>'*:*'
  end

In general, the direct connection is less than ideal in most applications. You’ll be missing out on Http caching, and it’ll be impossible to do distributed searches. The direct connection could possibly come in handy though, for quickly indexing large numbers of documents.

For more information about DirectSolrConnection, see the API.

Querying

Use the #select method to send requests to the /select handler:

  response = solr.select({
    :q=>'washington',
    :start=>0,
    :rows=>10
  })

The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters are generated for the Solr query. Example:

  solr.select :q=>'roses', :fq=>['red', 'violet']

The above statement generates this Solr query:

  ?q=roses&fq=red&fq=violet

Use the #request method for a custom request handler path:

  response = solr.request '/documents', :q=>'test'

A shortcut for the above example:

  response = solr.documents :q=>'test'

Updating Solr

Updating can be done using native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages". Raw XML strings can also be used.

Raw XML via #update

  solr.update '</commit>'
  solr.update '</optimize>'

Single document via #add

  solr.add :id=>1, :price=>1.00

Multiple documents via #add

  documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
  solr.add documents

When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) when using the #add method:

  doc = {:id=>1, :price=>1.00}
  add_attributes = {:allowDups=>false, :commitWithin=>10.0}
  solr.add(doc, add_attributes) do |doc|
    # boost each document
    doc.attrs[:boost] = 1.5
    # boost the price field:
    doc.field_by_name(:price).attrs[:boost] = 2.0
  end

Delete by id

  solr.delete_by_id 1

or an array of ids

  solr.delete_by_id [1, 2, 3, 4]

Delete by query:

  solr.delete_by_query 'price:1.00'

Delete by array of queries

  solr.delete_by_query ['price:1.00', 'price:10.00']

Commit & optimize shortcuts

  solr.commit
  solr.optimize

Response Formats

The default response format is Ruby. When the :wt param is set to :ruby, the response is eval’d resulting in a Hash. You can get a raw response by setting the :wt to "ruby" - notice, the string — not a symbol. RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, :wt=>’xml’ etc..

Evaluated Ruby (default)

  solr.select(:wt=>:ruby) # notice :ruby is a Symbol

Raw Ruby

  solr.select(:wt=>'ruby') # notice 'ruby' is a String

XML:

  solr.select(:wt=>:xml)

JSON:

  solr.select(:wt=>:json)

You can access the original request context (path, params, url etc.) by calling the #raw method:

  response = solr.select :q=>'*:*'
  response.raw[:status_code]
  response.raw[:body]
  response.raw[:url]

The raw is a hash that contains the generated params, url, path, post data, headers etc., very useful for debugging and testing.