Sunspot 2.0 README
The purpose of this document is to provide a framework for the development of Sunspot 2.0 using the README-driven development model. Sunspot 2.0 does not yet exist, but this document aims to describe the functionality that we hope to achieve when we build it.
Just add Sunspot to your Gemfile:
group :production do
  gem 'sunspot-client' # The gem-packaged Solr server isn't appropriate for production
end
group :test, :development do
  gem 'sunspot' # Meta-package of sunspot-client and sunspot-server
end
Install the gem from your shell:
sudo gem install sunspot
# To install the optional packaged Solr server (recommended for development):
sudo gem install sunspot_solr
Then add the dependency to your environment.rb:
config.gem 'sunspot'
Note: The sunspot_rails gem no longer needs to be installed. As of version 2.0, Sunspot will automatically include Rails integration if you load it in a Rails environment; it still works fine in a non-Rails environment as well.
Sunspot can be run without any explicit configuration, but if you're using Rails, the easiest way to maintain a consistent configuration across your team is to use the built-in generator to add configurations to your project:
script/rails generate sunspot
This will create the following new files in the current directory:
config/sunspot.yml
solr/conf/schema.xml
solr/conf/solrconfig.xml
solr/data/.gitignore
The sunspot.yml file is used for application-level configuration of Sunspot. This includes a URL at which Sunspot can access Solr; if the hostname of this URL is localhost or 127.0.0.1, the sunspot-solr executable will also use the configuration you've specified to start up the bundled Solr instance. An example sunspot.yml might look like this:
production:
  solr: http://solr.my-host.com/solr
development:
  solr:
    url: http://localhost:8982/solr
    max_memory: 1024M
    # TK more configuration options here
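As a rough illustration of the localhost rule described above, here is a minimal sketch that reads a sunspot.yml-style document and reports whether the configured Solr URL points at the local machine. The `solr_settings` helper, the `LOCAL_HOSTS` constant, and the shape of the returned hash are all hypothetical names chosen for this example, and Sunspot 2.0 may handle configuration quite differently. The sketch accepts both forms shown in the example file: a bare URL string and a nested mapping with a `url` key.

```ruby
require 'yaml'
require 'uri'

# Hypothetical helper, not part of Sunspot. Hosts that this sketch treats
# as "local", following the rule stated in the README.
LOCAL_HOSTS = %w[localhost 127.0.0.1].freeze

# Normalizes one environment's Solr settings into a hash.
def solr_settings(yaml_text, environment)
  solr = YAML.safe_load(yaml_text).fetch(environment).fetch('solr')
  # The example file allows either a bare URL string or a nested mapping.
  solr = { 'url' => solr } if solr.is_a?(String)
  url = solr.fetch('url')
  {
    url: url,
    max_memory: solr['max_memory'], # nil when not configured
    local: LOCAL_HOSTS.include?(URI.parse(url).host)
  }
end

EXAMPLE = <<~YAML
  production:
    solr: http://solr.my-host.com/solr
  development:
    solr:
      url: http://localhost:8982/solr
      max_memory: 1024M
YAML

puts solr_settings(EXAMPLE, 'development').inspect
puts solr_settings(EXAMPLE, 'production').inspect
```

With the example file, the development settings come back flagged as local and the production settings do not, which is the condition under which the bundled Solr instance would pick up the same configuration.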
The files in the solr/conf directory are used directly by the bundled Solr instance when you run it locally. You probably won't need to change them, but if you need advanced customization of Solr's behavior, these files are where you can do that. The solr/data directory contains your actual Solr index on disk, and is thus excluded from Git for you.
To start Solr in your development environment, simply run:
sunspot solr start # sunspot_solr gem must be installed
If you run this from the root of a Rails project, Sunspot will detect that and use your config/sunspot.yml if it's present.
Sunspot is designed to index and search Ruby objects that are persisted to a separate primary data store. Sunspot supports ActiveRecord, DataMapper, Mongoid, and MongoMapper [TK what else?] out of the box; it's quite easy to add support for other persistence layers. See the documentation for Sunspot::Adapter.
Configuring a model class for search primarily consists of defining which fields Sunspot should index, and setting those fields up with various options. Fields do not need to correspond to database columns; Sunspot will happily index the return value of any method your object responds to.
The examples in this README all assume we're building a straightforward blogging platform. Let's start with a simple configuration for our Post model.
class Post < ActiveRecord::Base
  include Sunspot::Searchable
  index :body
  index :blog_id
end
If your field names correspond to database columns (or predefined fields in Mongoid, etc.), Sunspot will infer the Solr field type from the column type. String columns are inferred as fulltext fields; other types are inferred as attribute fields of the corresponding type. Fulltext fields and attribute fields are quite different in their properties and usage.
Fulltext fields always have the type fulltext, and are used for keyword search. Solr breaks apart the data from fulltext fields into individual words, and when a fulltext search is performed, documents are matched against search terms on a word-by-word basis.
Attribute fields, on the other hand, are scalar data, and are indexed as-is without any analysis. Attribute fields can have several scalar types: string, integer, long, float, double, date, time, and boolean are the main ones. You can think of attribute fields as equivalent to columns in a database: they can be used for filtering search results to a certain scope (e.g. only return results with a blog_id of 1); ordering results; and faceting, a topic we will cover in more depth later in this README.
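To make the difference between the two kinds of field concrete, here is a toy in-memory sketch; it does not talk to Solr. The `analyze_fulltext`, `fulltext_match`, and `attribute_filter` helpers are hypothetical, and the analysis step (lowercasing and splitting on non-word characters, with every search term required to match) is an assumption standing in for whatever analyzer and query settings a real Solr schema would use.

```ruby
# Toy data standing in for indexed posts; not a Solr index.
POSTS = [
  { id: 1, blog_id: 1, body: 'Solr makes keyword search easy' },
  { id: 2, blog_id: 1, body: 'Searching with Ruby' },
  { id: 3, blog_id: 2, body: 'Easy faceting in Solr' }
].freeze

# Assumed analysis: lowercase the text and split it into words.
def analyze_fulltext(text)
  text.downcase.scan(/\w+/)
end

# Fulltext: documents are matched against search terms word by word.
# This sketch requires every term to be present in the document.
def fulltext_match(posts, query)
  terms = analyze_fulltext(query)
  posts.select { |post| (terms - analyze_fulltext(post[:body])).empty? }
end

# Attribute: the scalar value is compared as-is, with no analysis.
def attribute_filter(posts, field, value)
  posts.select { |post| post[field] == value }
end

puts fulltext_match(POSTS, 'easy solr').map { |post| post[:id] }.inspect
# prints [1, 3]
puts attribute_filter(POSTS, :blog_id, 1).map { |post| post[:id] }.inspect
# prints [1, 2]
```

The keyword query matches regardless of word order or capitalization, while the attribute filter only matches the exact scalar value, which is why attribute fields suit scoping and ordering.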
The above example uses the simplest method of populating fields; Sunspot will simply call the method named by the field, and index the return value if it's non-nil. If you wish to give the field a different name from the method that populates it, use the :using option:
class Post < ActiveRecord::Base
  index :my_blog_id, :using => :blog_id
end
This will populate a my_blog_id field in Solr using the return value of the Post#blog_id method.
If you wish to populate a field with data that is not defined by a method on your model class, you can pass a block to the field definition; the block is evaluated in the context of the model instance, and the return value is indexed. For instance, perhaps we wish to index the number of comments on a given post:
class Post < ActiveRecord::Base
  has_many :comments
  index(:comments_count, :as => :integer) { comments.count }
end
In this case, since the field does not correspond directly to a database column, we must explicitly specify the field type. If Sunspot cannot infer the field type and no type is specified, it will assume it is fulltext.
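The three ways of populating a field described above (the method named by the field, a different method via :using, and a block evaluated in the model's context) could be modeled roughly as follows. This is only a sketch: `FieldSketch`, `build_document`, and the `Struct`-based `Post` are hypothetical stand-ins with no connection to Sunspot's internals, and type inference from column types is left out, so any field without an explicit type simply falls back to fulltext.

```ruby
# Hypothetical stand-in for a field definition; not Sunspot's implementation.
class FieldSketch
  attr_reader :name, :type

  def initialize(name, options = {}, &block)
    @name = name
    @source = options[:using] || name # method to call when no block is given
    @block = block
    # Mirrors the :as => :integer usage in the README's example; with no
    # explicit type this sketch assumes fulltext (it does no inference).
    @type = options[:as] || :fulltext
  end

  # Returns the value to index for the given model instance.
  def value_for(model)
    if @block
      model.instance_exec(&@block) # block runs in the model's context
    else
      model.public_send(@source)
    end
  end
end

# Builds a field-name => value hash, skipping nil values.
def build_document(fields, model)
  fields.each_with_object({}) do |field, doc|
    value = field.value_for(model)
    doc[field.name] = value unless value.nil?
  end
end

# A plain Struct plays the role of the model here.
Post = Struct.new(:blog_id, :body, :comments)

FIELDS = [
  FieldSketch.new(:body),
  FieldSketch.new(:my_blog_id, :using => :blog_id, :as => :integer),
  FieldSketch.new(:comments_count, :as => :integer) { comments.count }
].freeze

post = Post.new(7, 'Hello world', %w[first second])
puts build_document(FIELDS, post).inspect
```

Using `instance_exec` is one straightforward way to get the "evaluated in the context of the model instance" behavior, since it lets the block call `comments` without an explicit receiver.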
A special type of attribute field is a reference field. These are fields that hold references to other persistent objects; they're particularly useful for faceting. For example, instead of our blog_id field above, we might simply index blog as a reference field:
class Post < ActiveRecord::Base
  belongs_to :blog
  index :blog
end
Now instead of working with an integer when using this field, we'll be working with actual Blog objects.
Reference fields can also be passed a block, which is an easy way to index data from associated objects. Again, Sunspot will attempt to infer the field's type from the column type in the associated object.
class Post < ActiveRecord::Base
  belongs_to :blog
  index :blog do
    index :name
  end
end
TK
The following options are available on all fields:
:stored - By default, Sunspot does not add field data to Solr in a way that allows Solr to return that field data in search results; instead, Sunspot only stores the object's class name and primary key, and uses that information from the search result to load the original object out of the primary database. You can override this behavior on a per-field basis to instruct Solr to return the field data in search results; in certain cases, this can allow you to bypass looking up the original objects in the database altogether, giving you a performance boost.
:as - **Advanced.** Usually, Sunspot constructs an internal field name for your fields based on the field type and options you've set; Sunspot's built-in Solr schema is set up to follow the same naming conventions. In certain cases, such as legacy schemas or for functionality not supported by Sunspot, you may want to override this and directly set the field name that will be used internally.
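The trade-off behind :stored can be pictured with a small in-memory sketch. Everything here is a hypothetical stand-in: `DATABASE` plays the primary data store, `HITS` plays the raw results a search might return, and the `load_objects` and `stored_values` helpers are invented for illustration. The sketch assumes the title was indexed as a stored field.

```ruby
# Stand-in for the primary data store, keyed by class name and primary key.
DATABASE = {
  ['Post', 1] => { title: 'First post', body: 'A long body' },
  ['Post', 2] => { title: 'Second post', body: 'Another long body' }
}.freeze

# Stand-in for search hits. Each hit carries the class name and primary key;
# the title is present only because this sketch assumes it was stored.
HITS = [
  { class_name: 'Post', id: 2, stored: { title: 'Second post' } },
  { class_name: 'Post', id: 1, stored: { title: 'First post' } }
].freeze

# Default path: use class name and primary key to load each original object.
def load_objects(hits)
  hits.map { |hit| DATABASE.fetch([hit[:class_name], hit[:id]]) }
end

# Stored-field path: read the value straight from the hit, with no lookup.
def stored_values(hits, field)
  hits.map { |hit| hit[:stored].fetch(field) }
end

puts load_objects(HITS).map { |post| post[:title] }.inspect
puts stored_values(HITS, :title).inspect
```

Both paths yield the same titles in the same order; the second never touches the primary store, which is where the performance gain would come from when only stored fields are needed.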
TK
TK
TK
$ sunspot reindex
TK
TK
TK
TK
TK
TK
TK
TK
Post.search do
  where(:blog_id => 1)
  where(:comments_count).gt(0)
end
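One way to picture how a search block like the one above could be turned into Solr filter queries is the sketch below. The `QuerySketch` and `Restriction` classes are hypothetical and are not Sunspot's implementation; only the `where` and `gt` method names come from the README's example. The sketch also assumes the Solr field names equal the Ruby field names, whereas Sunspot constructs its own internal field names.

```ruby
# Hypothetical query builder that collects Solr-style filter strings.
class QuerySketch
  attr_reader :filters

  def initialize(&block)
    @filters = []
    instance_eval(&block) # lets the block call where(...) without a receiver
  end

  # where(:field => value) adds an equality filter;
  # where(:field) returns an object that accepts a comparison such as gt.
  def where(arg)
    if arg.is_a?(Hash)
      arg.each { |field, value| @filters << "#{field}:#{value}" }
      self
    else
      Restriction.new(arg, @filters)
    end
  end

  # Holds a field name until a comparison method is called on it.
  class Restriction
    def initialize(field, filters)
      @field = field
      @filters = filters
    end

    # Curly braces mark an exclusive bound in Solr range syntax;
    # * leaves the upper end open.
    def gt(value)
      @filters << "#{@field}:{#{value} TO *}"
    end
  end
end

query = QuerySketch.new do
  where(:blog_id => 1)
  where(:comments_count).gt(0)
end

puts query.filters.inspect
# prints ["blog_id:1", "comments_count:{0 TO *}"]
```

Collecting each restriction as a separate filter string keeps the scoping conditions independent of any keyword query, which matches how attribute fields are described earlier as being used for filtering.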
TK
TK
TK
TK
TK