Cell support for Sunspot (packaged as a gem), including this patch that changes how files are opened: https://github.com/chebyte/sunspot_cell/commit/86dee3c113ee20069c7bcad17

README.rdoc

Sunspot Cell (gem)

Note by Zheileman

This gem adds Cell support (for indexing rich documents such as PDF, DOC and HTML files) to Sunspot. It was developed against Sunspot 1.3.0 and supports Paperclip and S3 storage.

The code is based on the patch included here: outoftime.lighthouseapp.com/projects/20339/tickets/98-solr-cell

Requirements

Thanks to Chris Powell, there is now a cookbook-style blog post on getting a Rails 3.2 app to properly index rich documents using this gem: cbpowell.wordpress.com/2012/09/18/indexing-rich-documents-with-rails-sunspot-solr-sunspot-cell-and-carrierwave-cookbook-style/

  • Sunspot gem installed (>= 1.3.0)

  • Solr Cell libraries (dist/apache-solr-cell-1.4.X.jar and +contrib/extraction/lib/*.jar+ from the standard Solr distribution) placed in the /solr/lib directory created by the Sunspot gem. This applies to the development environment; your production setup might vary.

  • Adjustments to the Solr schema.xml:

<fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />

and

<dynamicField name="*_attachment" stored="true" type="text" multiValued="true" indexed="true"/>
<dynamicField name="ignored_*" type="ignored"/>
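
The Solr Cell libraries requirement above can be scripted. The following is only a minimal sketch using Ruby's standard library: the method name +copy_solr_cell_jars+ and both directory arguments are assumptions made for illustration, not part of this gem, so adjust the paths to wherever your Solr distribution and Sunspot's solr/lib directory actually live.

```ruby
require 'fileutils'

# Copy the Solr Cell jars from an unpacked Solr distribution into the
# solr/lib directory used by Sunspot.
# Hypothetical helper: both directory arguments are assumptions.
def copy_solr_cell_jars(solr_dist_dir, sunspot_solr_lib_dir)
  patterns = [
    File.join(solr_dist_dir, 'dist', 'apache-solr-cell-*.jar'),
    File.join(solr_dist_dir, 'contrib', 'extraction', 'lib', '*.jar')
  ]
  jars = patterns.flat_map { |pattern| Dir.glob(pattern) }.sort

  # Create the target directory if needed, then copy every jar found.
  FileUtils.mkdir_p(sunspot_solr_lib_dir)
  FileUtils.cp(jars, sunspot_solr_lib_dir) unless jars.empty?

  # Return the copied file names so the caller can check the result.
  jars.map { |jar| File.basename(jar) }
end

# Example call (placeholder paths):
#   copy_solr_cell_jars('/path/to/solr-distribution', 'solr/lib')
```

The method returns the names of the jars it copied, so an empty array is a hint that the distribution path is wrong.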

Install Plugin

Add the sunspot_rails and sunspot_cell gems to your Gemfile:

gem 'sunspot_rails', '~> 1.3.0'
gem 'sunspot_cell', :git => 'git://github.com/zheileman/sunspot_cell.git'

Usage

class Doc
  searchable do
    text :title
    attachment :file
  end
end

Paperclip & S3 Storage

require 'open-uri'

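class Doc
  searchable do
    text :title
    attachment :attached_file
  end

  private

  # Returns the remote location of the file as a URI.
  # remote_full_url is assumed to be defined elsewhere in the model.
  def attached_file
    URI.parse(remote_full_url)
  end
end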
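
The example above calls +remote_full_url+ without showing it. The following is a hedged sketch of what such a method might look like for a file stored on S3. The attributes +s3_bucket+ and +file_path+ and the URL layout are assumptions made for illustration. With Paperclip the URL would more typically come from the attachment's own +url+ method. The class is a plain Ruby stand-in with no +searchable+ block so that it runs on its own, and +attached_file+ is left public here only so it can be called directly.

```ruby
require 'uri'

# Plain Ruby stand-in for a model whose file lives on S3.
# s3_bucket and file_path are hypothetical attributes.
class Doc
  attr_reader :s3_bucket, :file_path

  def initialize(s3_bucket, file_path)
    @s3_bucket = s3_bucket
    @file_path = file_path
  end

  # The value handed to `attachment :attached_file` in the README example.
  def attached_file
    URI.parse(remote_full_url)
  end

  private

  # Assumed URL layout; replace with however your app exposes the file URL.
  def remote_full_url
    "https://#{s3_bucket}.s3.amazonaws.com/#{file_path}"
  end
end
```

Note that URI.parse raises on characters such as spaces, so file paths may need escaping before they are interpolated into the URL.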