SOlR update

commit 4d2dfeedc1dc2fe3693438325ce9fcfb452004a5 1 parent fa7d451
@Valve authored
136 source/_posts/2014-02-22-rails-developer-guide-to-full-text-search-with-solr.markdown
@@ -130,7 +130,7 @@ It's quite possible to use a
specialized [Point](http://www.postgresql.org/docs/current/static/datatype-geometric.html#AEN6547) data type, but I want to keep it simple here.
I make `lat` & `lon` attributes nullable in case a user
-denies the browser geolocation permission and his profiles is saved without those values.
+denies the browser geolocation permission and his profile is saved without those values.
Let's create the databases and run the migration.
@@ -341,16 +341,17 @@ directly is to send rather cryptic XML requests.
2. Nobody wants to mess with raw XML over HTTP, so here enters RSolr - a wrapper around Solr HTTP API that allows interacting with Solr from Ruby code.
3. However RSolr is still rather low-level and does not provide any DSL or convenience methods
to define which Rails models should be searchable and how the indexes will be updated.
-The need for a new library was apparent, so the sunspot was born. A really nice DSL that
+The need for a new library was apparent, so the Sunspot was born. A really nice DSL that
integrates directly into ActiveRecord models and allows to specify which attributes we need
to index, how to transform and query the data.
Now you're saying: "_I still don't understand, if the Solr is a Java service it means
-I need to install and configure it on my system? That's a horrible perspective, get me out of this!_". Absolutely not. Sunspot gem is bundled with a development version of solr and has a nice set of rake tasks to manage it. You can start, stop, reindex the data, all using rake tasks. There is no need to install Solr manually, all you need is to add two gems:
+I need to install and configure it on my system? That's a horrible perspective, get me out of this!_". Absolutely not. Sunspot gem is bundled with a development version of Solr and has a nice set of rake tasks to manage it. You can start, stop, reindex the data, all using rake tasks. There is no need to install Solr manually, all you need is to add two gems:
`sunspot_solr` and `sunspot_rails`.
-`sunspot_solr` is the pre-packaged development version of Solr and `sunspot_rails` is the sunspot gem itself. So you need to make sure you place the `sunspot_solr` into `:development` group in your Gemfile.
+`sunspot_solr` is the pre-packaged development version of Solr and `sunspot_rails`
+is the Sunspot gem itself. So you need to make sure you place the `sunspot_solr` into `:development` group in your Gemfile.
OK, now that confusion is hopefully out of the way, let's continue with our people search
scenario.
@@ -363,7 +364,7 @@ class Person < ActiveRecord::Base
searchable do
text :name, boost: 5.0
- text :about, :likes, :dislikes
+ text :about, :likes
latlon(:location) { Sunspot::Util::Coordinates.new(lat, lon) }
end
end
@@ -371,12 +372,12 @@ end
Let's break it down piece by piece:
-1. `searchable` block is a place where you define the full-text behavior.
+1. `searchable` block is a place where you define the full-text indexing behavior.
Inside this block you can specify various rules describing which attributes should
be indexed, their pre-index transformations, facets, filters and so on.
2. `text :name` - person should be searchable by its name. By searchable I mean full-text searchable.
-3. `boost: 5.0` - boost option tells solr to prioritize the results found by this particular attribute. If you're searching for `John Doe`, all the people with such name will come first, and only after them those, who dislike Johns Doe (or John Does, I don't know which is correct).
-4. `text :about, :likes, :dislikes` - person should be searchable by these attributes.
+3. `boost: 5.0` - boost option tells Solr to prioritize the results found by this particular attribute. If you're searching for `John Doe`, all the people with such name will come first, and only after them those, who dislike Johns Doe (or John Does, I don't know which is correct).
+4. `text :about, :likes` - person should be searchable by these attributes.
5. `latlon(:location) { Sunspot::Util::Coordinates.new(lat, lon) }` - create a geo-spatial
index on person's location using `lat` and `lon` attributes. This will allow to search for
people within a certain mile radius.
@@ -388,7 +389,7 @@ Now we're ready to actually search for people.
Let us add a `_person` partial where search result item will be displayed:
{% codeblock lang:erb %}
-# app/views/people/_person.html.erb
+<%# app/views/people/_person.html.erb %>
<div class="person">
<h4><%= person.name %> </h4>
<h5>About:</h5>
@@ -404,7 +405,7 @@ Let us add a `_person` partial where search result item will be displayed:
We also need to add the iteration to the `index` view:
{% codeblock lang:erb %}
-# app/views/people/index.html.erb
+<%# app/views/people/index.html.erb %>
<% @people.each do |person| %>
<%= render partial: 'person', locals: {person: person} %>
<% end %>
@@ -413,6 +414,7 @@ We also need to add the iteration to the `index` view:
So the view is ready, let's modify the controller code:
{% codeblock lang:ruby %}
+# app/controllers/people_controller.rb
def index
if current_user
if params[:search].present? || params[:radius].present?
@@ -432,36 +434,150 @@ def index
end
{% endcodeblock %}
+On line `3` we check if current user is saved, on line `4` we verify we have something to
+search by, either a search term or a radius. Then on lines `5 - 10` is where the actual
+full-text search happens. We use a `Model.search` method and pass it a block.
+Inside this block we need to specify the logic of the search.
+In our case we call `fulltext` method and pass it our search term.
+Let me be clear, we have two phases: **indexing** and **searching**. Indexing is defined
+inside a model in a `searchable` block. You use `text` method to specify which attributes
+should be full-text searchable.
+Searching is done by calling `Model.search` method and passing it a block too. But this time
+we call `fulltext` method to actually do full-text search on indexed attributes.
+OK, we now know how to do full-text search on text attributes, we're already doing it on
+`name`, `about` and `likes` attributes. What we also need is a way to restrict the results
+to a certain radius on a map. This is what lines `7 - 9` are for.
+In our application it's possible that a user denies the geolocation permission and his
+profile is saved without coordinates. So we need a convenience method to see
+if current user has a location or not:
+{% codeblock lang:ruby %}
+# app/models/person.rb
+def has_location?
+ lat && lon
+end
+{% endcodeblock %}
+This method is useful in `Person.search` block where we specify the search radius:
+
+{% codeblock lang:ruby %}
+# app/controllers/people_controller.rb
+if current_user.has_location?
+ with(:location).in_radius(current_user.lat, current_user.lon, params[:radius])
+end
+{% endcodeblock %}
+We're using current user's `lat` & `lon` attributes and the radius from params to perform the
+filtering. If your form collects the radius in miles, remember to convert it to kilometers,
+because Sunspot operates on kilometers.
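The conversion mentioned above can be sketched as a small helper. This is a minimal sketch, not part of Sunspot or the original app: the name `miles_to_km` and the decision to return `nil` for blank or invalid input are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of Sunspot): converts a radius given in miles
# to kilometers before it is handed to in_radius.
KM_PER_MILE = 1.609344

# Accepts a String (as it arrives in params) or a Numeric.
# Returns nil for blank, non-numeric or non-positive input.
def miles_to_km(miles)
  value = Float(miles)
  value > 0 ? value * KM_PER_MILE : nil
rescue ArgumentError, TypeError
  # Float("") and Float("abc") raise ArgumentError, Float(nil) raises TypeError.
  nil
end

# Assumed usage: compute the radius once, before building the search.
radius_km = miles_to_km("10")
```

Computing `radius_km` before the `Person.search` block and passing it as the third argument to `in_radius` keeps the unit handling in one place; a `nil` result is a signal to skip the radius filter altogether.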
+OK, first version of the people search is ready to try, let's run it.
+Works fine, but when I search for someone within 10 mile radius, I find myself too.
+There should be a way to search for _other_ people, excluding myself. Let's fix it.
+Sunspot allows using attributes as filters. For this we should call methods like `integer`,
+`string`, `time` etc. In this case we need to search for all people except those
+with `:id` equal to the `:id` of current user. We also need to filter out the people with
+`dislikes` equal to the search term:
+{% codeblock lang:ruby %}
+# app/models/person.rb
+searchable do
+ text :name, boost: 5.0
+ text :about, :likes
+ integer(:id)
+ string(:dislikes)
+ latlon(:location) { Sunspot::Util::Coordinates.new(lat, lon) }
+end
+{% endcodeblock %}
+On line `5` we're creating an indexed filter on `:id` column, and on the next line a filter on
+`:dislikes` column.
+Now the filtering itself:
+{% codeblock lang:ruby %}
+# app/controllers/people_controller.rb
+def index
+ if current_user
+ if params[:search].present? || params[:radius].present?
+ search = Person.search do
+ without(:id, current_user.id)
+ without(:dislikes, params[:search]) if params[:search].present?
+ fulltext params[:search]
+ if current_user.has_location?
+ with(:location).in_radius(current_user.lat, current_user.lon, params[:radius])
+ end
+ end
+ @people = search.results
+ else
+ @people = []
+ end
+ else
+ @new_person = Person.new
+ end
+end
+{% endcodeblock %}
+On line `6` we're filtering out people with `:id` equal to current user's id.
+On line `7` we're filtering out people who dislike stuff I'm searching for.
+What does `Person.search` return? It's a special Sunspot object that has a `results`
+method. So to grab the actual ActiveRecord objects, we use `@people = search.results`.
+Finally we have all pieces of the puzzle. If we run the app now we should be able to save
+current user's profile and then go search for other people.
+In this article I've barely scratched the surface of the Solr & Sunspot capabilities. You should definitely look for more in the documentation if you want to create a full-featured
+application.
+### But why should I use full-text search if I can do everything in SQL?
+You're right, except you can't.
+Full text search is a huge topic with a huge set of capabilities.
+It can do synonyms search, wildcard search, stemming
+and a [lot, lot more](https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters).
+Solr can even decompose words during a search, operate on
+word parts and generally behave (almost) like a human.
+Full-text search is faster too. How much faster? This is a tricky question, because it all
+depends on the indexed data, but one can safely assume it can be at least several times
+faster than equivalent SQL searching. For complex searches Solr can be orders of magnitude
+faster than SQL.
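To make the difference from a SQL `LIKE` query more concrete, here is a toy inverted index with very naive stemming. It is only an illustration of the general idea (analyze text once at index time, analyze the query the same way, then look tokens up); the class name `ToyIndex` and its suffix list are invented for this sketch, and Solr's real analysis chain is far more capable.

```ruby
require 'set'

# Toy illustration only; this is not how Solr is implemented.
class ToyIndex
  SUFFIXES = %w[ing s].freeze

  def initialize
    # token => Set of document ids containing that token
    @postings = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Index time: analyze the text once and record which document has which token.
  def add(id, text)
    analyze(text).each { |token| @postings[token] << id }
  end

  # Query time: analyze the query the same way and intersect the posting sets,
  # so only documents containing every query token are returned.
  def search(query)
    tokens = analyze(query)
    return [] if tokens.empty?
    tokens.map { |token| @postings.fetch(token, Set.new) }.reduce(:&).to_a.sort
  end

  private

  # Lowercase, split into words, and stem each word.
  def analyze(text)
    text.downcase.scan(/[a-z0-9]+/).map { |word| stem(word) }
  end

  # Extremely naive stemming: strip one common suffix from longer words.
  def stem(word)
    suffix = SUFFIXES.find { |s| word.end_with?(s) && word.length > s.length + 2 }
    suffix ? word[0...-suffix.length] : word
  end
end

index = ToyIndex.new
index.add(1, "Likes hiking and dogs")
index.add(2, "I like to hike with my dog")
index.add(3, "Cooking pasta")
```

Here `index.search("dogs")` finds both profile 1 and profile 2, even though profile 2 only contains "dog"; a plain `LIKE '%dogs%'` would miss it. The lookup is also a hash access per token rather than a scan over every row, which hints at where the speed difference comes from.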
+### How is new data indexed?
+Sunspot handles it for you. It registers a set of hooks that trigger the automatic indexing
+of updated and new records. If you look into rails log, you'll see something like:
+{% codeblock %}
+ SOLR Request (455.4ms) [ path=update parameters={} ]
+ (1.1ms) COMMIT
+Redirected to http://localhost:3000/people
+ SOLR Request (60.9ms) [ path=update parameters={} ]
+{% endcodeblock %}
+### How do I test it?
+You should generally avoid touching Solr in unit tests. Either design the code under test
+so that it never talks to Solr, or stub Solr to return pre-canned results.
+As for integration tests, indexing data before running them worked best for me.
+I first prepare some test data, then I reindex it with:
+`rake sunspot:reindex`
+and then run the integration tests.
+If you find the topic of testing interesting, drop me a line, I'll cover it in the next article.
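One way to keep Solr out of unit tests, as suggested above, is to pass the search in as a dependency and replace it with a fake. The sketch below is plain Ruby and every name in it (`PeopleFinder`, `FakeSearch`, `StubPerson`) is hypothetical; in the real application the injected searcher would presumably wrap `Person.search`.

```ruby
# Hypothetical names throughout: none of these classes come from Sunspot
# or from the original application.

# Stand-in for the object returned by Person.search; it only mimics #results.
FakeSearch = Struct.new(:results)

# Stand-in for a Person record; only the attribute the test needs.
StubPerson = Struct.new(:name)

# Wraps the search call so the Solr-backed searcher can be swapped out in tests.
class PeopleFinder
  # searcher: any object responding to #call(term) and returning
  # something that has a #results method.
  def initialize(searcher)
    @searcher = searcher
  end

  # Returns the names of matching people, or [] for a blank search term.
  def names_matching(term)
    return [] if term.to_s.strip.empty?
    @searcher.call(term).results.map(&:name)
  end
end

# In a unit test a lambda returns canned results, so Solr is never contacted.
canned_searcher = ->(_term) { FakeSearch.new([StubPerson.new('John Doe')]) }
finder = PeopleFinder.new(canned_searcher)
names = finder.names_matching('john')
```

The logic around the search (blank-term handling, mapping results) is exercised without a running Solr instance, while the real query is left to the integration tests described above.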
+### Code
+https://github.com/Valve/neibo
+Well, I hope the explanation wasn't too packed, share your ideas in the comments :)
