Permalink
Find file
Fetching contributors…
Cannot retrieve contributors at this time
63 lines (40 sloc) 1.98 KB

Sunspot Cell (gem)

Note by Zheileman

This gem adds Cell support (for indexing rich documents like pdf, docs, html, etc…) to Sunspot (developed against Sunspot 1.3.0). Support Paperclip and S3 Storage

The code is based on the patch included here: outoftime.lighthouseapp.com/projects/20339/tickets/98-solr-cell

Requirements

Thanks to Chris Powell, there is now a cook-book style blog post for getting a Rails 3.2 app properly index rich documents using this gem: cbpowell.wordpress.com/2012/09/18/indexing-rich-documents-with-rails-sunspot-solr-sunspot-cell-and-carrierwave-cookbook-style/

  • Sunspot gem installed (>= 1.3.0)

  • Solr Cell libraries (dist/apache-solr-cell-1.4.X.jar and +contrib/extraction/lib/*.jar+ from the standard Solr distribution) placed in the /solr/lib directory as created by the Sunspot gem, in development environment. Your production setup might vary.

  • Adjustments to the Solr schema.xml:

<fieldType name=“ignored” stored=“false” indexed=“false” multiValued=“true” class=“solr.StrField” /> and <dynamicField name=“*_attachment” stored=“true” type=“text” multiValued=“true” indexed=“true”/> <dynamicField name=“ignored_*” type=“ignored”/>

Install Plugin

Add sunspot gem and sunspot_cell to Gemfile:

gem 'sunspot_rails', '~> 1.3.0'
gem 'sunspot_cell', :git => 'git://github.com/zheileman/sunspot_cell.git'

Usage

class Doc
  searchable do
     text :title
     attachment :file
   end
end

Paperclip & S3 Storage

require 'open-uri'

class Doc
  searchable do
     text :title
     attachment :attached_file
   end

private
  def attached_file
    URI.parse(remote_full_url)
  end
end