Working with search

bensie edited this page Aug 11, 2011 · 14 revisions

Initiating a search

Sunspot searches are initiated with the Sunspot.search method. Its arguments are one or more classes to search, and you can optionally pass a block to construct the search. In the simplest case, a search for all instances of a class, no block is passed:

search = Sunspot.search(Post)

To search for more than one class at the same time, just pass multiple classes:

search = Sunspot.search(Post, Comment)

If you’re using Sunspot::Rails in a Rails application, you can call the `search` method directly on your model class:

search = Post.search

Building non-trivial searches using the block DSL is covered in the following chapters; this chapter covers working with the search results.

Getting search results

To get the full range of result information from your search, use the #hits method. This returns a collection of Sunspot::Search::Hit objects; each hit encapsulates data about a single search result.

To get the actual model object referenced by a hit, call Hit#result. If all you want is the model objects returned by the search, and you don’t care about any of the search metadata associated with each result (relevance score, geo distance, stored fields, keyword highlights, etc.), you can call the #results method directly on your search object instead. This returns a collection of model objects.
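As a rough sketch of the difference between the two approaches, the helpers below take a search object and use only the #hits, Hit#result, Hit#score, and #results methods described above. The helper names are made up for illustration and are not part of Sunspot.

```ruby
# Hypothetical helper (not part of Sunspot): pairs each model object with
# its relevance score by going through the Hit objects.
def results_with_scores(search)
  search.hits.map { |hit| [hit.result, hit.score] }
end

# Hypothetical helper (not part of Sunspot): returns just the model objects
# when the search metadata is not needed.
def models_only(search)
  search.results
end
```

You would call these with the object returned by Sunspot.search, for example results_with_scores(Sunspot.search(Post)).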

Working with WillPaginate

If WillPaginate has been loaded, Sunspot will automatically integrate the hits and results methods with it. So, in your template, this will work:

<div class="pagination">
  <%= will_paginate(@search.hits) %>
</div>

For this to work, pass the current page parameter to the search with the paginate method:

Sunspot.search(Post) do
  paginate(:page => params[:page])
end

Working with hit metadata

Beyond the actual result instances, Solr provides a few other pieces of information with search hits. If the setup for the class of the result has stored fields, Solr will return the stored field values. And if the search being performed includes a keyword component, Solr will return a relevance score. The example below uses the #each_hit_with_result method, a convenience iterator which yields each Hit object along with its result model (in this case, let’s say we searched for Post instances):

<div class="results">
  <% @search.each_hit_with_result do |hit, post| -%>
    <div class="result">
      <h2><%= hit.stored(:title) %></h2>
      <div class="score"><%= hit.score %></div>
      <p><%= h(post.body) %></p>
    </div>
  <% end -%>
</div>

Note that when working with Hit objects, the actual object referenced by the Hit is not instantiated unless you call the result method, which then populates all of the Hit objects for the search in a batch. Clearly, stored fields are primarily useful if they contain enough relevant data that you don’t need to instantiate the result objects.
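If stored fields give you everything you need, you can build your output from the hits alone. The sketch below assumes a :title stored field, as in the example above; the helper name is hypothetical. Because it never calls Hit#result, it should not trigger the batch load of model objects.

```ruby
# Hypothetical helper (not part of Sunspot): collects display data from
# stored fields and scores only. It never calls hit.result, so the model
# objects are not instantiated.
def hit_summaries(search)
  search.hits.map do |hit|
    { :title => hit.stored(:title), :score => hit.score }
  end
end
```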

Keyword highlighting

If you requested keyword highlighting on your search, the Hit objects will give you access to highlighted phrases via the highlights method. You can optionally pass a field name to that method, in which case you’ll only get the highlights in that field; otherwise, you’ll get any matching highlights. There’s also the #highlight method, which must be passed a field name and returns the first highlight for the given field, if there is one.

Highlight objects expose the #format method, which takes a block that formats each highlighted fragment. This allows you to put highlight formatting logic where it belongs: in your view layer.

Here’s an example of working with highlighted search results:

<ul class="search_results">
  <% @search.each_hit_with_result do |hit, post| -%>
    <li>
      <h3><%= h(post.title) %></h3>
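      <%# hit.highlight returns nothing when the field has no highlight, so guard before formatting -%>
      <% if (highlight = hit.highlight(:body)) -%>
        <p class="summary"><%= highlight.format { |fragment| content_tag(:em, fragment) } %></p>
      <% end -%>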
    </li>
  <% end -%>
</ul>

In that example, each highlighted keyword is wrapped in an <em> tag.
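To format every highlight for a field rather than just the first, you can map over the collection returned by the highlights method. This is a minimal sketch using only Hit#highlights and Highlight#format as described above; the helper name and the choice of a <strong> tag are assumptions for illustration.

```ruby
# Hypothetical helper (not part of Sunspot): formats all highlights for one
# field, wrapping each highlighted keyword in a <strong> tag.
def highlighted_fragments(hit, field)
  hit.highlights(field).map do |highlight|
    highlight.format { |word| "<strong>#{word}</strong>" }
  end
end
```

In a view you would more likely build the tag with content_tag, as in the example above; plain string interpolation is used here only to keep the sketch independent of Rails.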

Assumed inconsistency and verified hits

Solr is currently not built to handle “real-time” search; in other words, an application that experiences highly frequent writes can’t expect to always have its Solr index completely up to date. While having your index out of sync for a few seconds or minutes after an add or update operation is usually tolerable, after a delete operation an out-of-sync state can be highly problematic. If your search is returning results that don’t actually exist in your database, you may end up with broken links in your search result page.

Sunspot operates on a doctrine of “assumed inconsistency”: it doesn’t break if the Solr results reference an object that doesn’t actually exist in the database. Instead, when you use the #results and #each_hit_with_result methods, Sunspot simply returns the results that it finds in your data store and throws away the references to results that it doesn’t find.

The #hits method operates a bit differently, because it is designed to allow you to work with Solr search data without ever touching your database, if you so desire. Thus, it doesn’t by default check the results against the data store. If you want to return only Hit objects that reference results that are confirmed to exist in your data store, pass :verify => true as an option into the #hits method.
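One way to see the difference is to compare the unverified and verified hits for the same search. The sketch below assumes #hits returns an array-like collection that responds to size; the helper name is hypothetical.

```ruby
# Hypothetical helper (not part of Sunspot): counts hits whose referenced
# object could not be confirmed in the data store.
def stale_hit_count(search)
  all_hits      = search.hits                    # straight from Solr, no database check
  verified_hits = search.hits(:verify => true)   # only hits confirmed in the data store
  all_hits.size - verified_hits.size
end
```

Keep in mind that verifying hits has to consult your data store, so this gives up the "never touch the database" benefit of plain #hits. For a paginated search, the count presumably covers only the current page.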

Other data in your search

To get the total number of results in the index matching your search criteria, use the Search#total method.
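For instance, you might combine the total with the number of hits on the current page to build a short summary line. This is only a sketch; the helper name and the wording of the message are made up.

```ruby
# Hypothetical helper (not part of Sunspot): describes how many of the
# matching documents are on the current page of hits.
def results_summary(search)
  "Showing #{search.hits.size} of #{search.total} results"
end
```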

Working with facets

Facet results are retrieved using the Search#facet method, which takes a single argument: for field and time-range facets, the argument is the name of the field being faceted upon; for query facets, the argument is whatever name you gave the facet when building the search.

All facet objects expose a single method, rows, which returns a collection of FacetRow objects. FacetRows expose two methods: count, which gives the number of documents in the search results that have the row’s value (or that match the row’s query); and value. For field facets, value is simply the value for the given field associated with that row; for time-range facets, value is a Range object representing the time range associated with that row; and for query facets, value is whatever you defined it as when building the search.
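As a small sketch of reading facet data outside a template, the helper below turns a facet's rows into value/count pairs. It uses only Search#facet, rows, value, and count as described above, and assumes the named facet was requested when the search was built; the helper name is hypothetical.

```ruby
# Hypothetical helper (not part of Sunspot): returns [value, count] pairs
# for the named facet, in the order the rows were returned.
def facet_counts(search, facet_name)
  search.facet(facet_name).rows.map { |row| [row.value, row.count] }
end
```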

Instantiated facets

For fields set up with the :references option, Sunspot will create instantiated facets in place of normal field facets. Instantiated facet rows respond to the instance method, which loads the instance whose primary key is the value of the facet row. As with hits, instantiated facet instances are lazy-loaded, but when instance is called on any facet row, all the instances for that facet will be batch-loaded.

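Here’s an example of working with instantiated facet results. It assumes the category_ids field references a Category model that has a name attribute:

<div class="facets">
  <h3>Browse by Category</h3>
  <ul class="facet">
    <% for row in @search.facet(:category_ids).rows -%>
      <%# row.instance loads the referenced object; row.value is still its primary key -%>
      <li><%= link_to(h(row.instance.name), url_for(:category_id => row.value)) %> (<%= row.count %>)</li>
    <% end -%>
  </ul>
</div>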