<?xml version="1.0" encoding="UTF-8"?>
<commit>
  <added type="array"/>
  <modified type="array">
    <modified>
      <diff>@@ -4,155 +4,13 @@
 # Copyright (c) 2008 UK Citizens Online Democracy. All rights reserved.
 # Email: francis@mysociety.org; WWW: http://www.mysociety.org/
 #
-# $Id: acts_as_xapian.rb,v 1.19 2008/04/30 12:55:14 francis Exp $
-
-# TODO:
-# Test :eager_load
-# Test :if
-# Test reverse sorting
+# $Id: acts_as_xapian.rb,v 1.22 2008/05/16 12:29:39 francis Exp $
 
 # Documentation
 # =============
 #
-# Xapian is a full text search engine library, which has Ruby bindings.
-# acts_as_xapian adds support for it to Rails. It is an alternative to
-# acts_as_lucene or acts_as_ferret.
-#
-# Xapian is an *offline indexing* search library - only one process can have
-# the database open for writing at once, and others that try meanwhile are
-# unceremoniously kicked out. For this reason, acts_as_xapian does not support
-# automatic writing to the database when your models change.
-#
-# Instead, there is a ActsAsXapianJob model which stores which models need
-# updating or deleting in the search index. A rake task 'xapian:update_index'
-# then performs the updates since last change. Run it on a cron job, or
-# similar.
-#
-# Xapian 1.0.5 and associated Ruby bindings are required.
-#
-# Email francis@mysociety.org with patches.
-#
-#
-# Comparison to acts_as_solr (as on 24 April 2008)
-# ==========================
-#
-# * Offline indexing only mode - which is a minus if you want changes
-# immediately reflected in the search index, and a plus if you were going to
-# have to implement your own offline indexing anyway.
-#
-# * Collapsing - the equivalent of SQL's &quot;group by&quot;. You can specify a field
-# to collapse on, and only the most relevant result from each value of that
-# field is returned. Along with a count of how many there are in total.
-# acts_as_solr doesn't have this.
-#
-# * No highlighting - Xapian can't return you text highlighted with a search
-# query. You can try and make do with TextHelper::highlight (combined with
-# words_to_highlight below). I found the highlighting in acts_as_solr didn't
-# really understand the query anyway.
-#
-# * Date range searching - maybe this works in acts_as_solr, but I never found
-# out how.
-#
-# * Spelling correction - &quot;did you mean?&quot; built in and just works.
-#
-# * Multiple models - acts_as_xapian searches multiple models if you like,
-# returning them mixed up together by relevancy. This is like multi_solr_search,
-# only it is the default mode of operation and is properly supported.
-#
-# * No daemons - However, if you have more than one web server, you'll need to
-# work out how to use Xapian's remote backend http://xapian.org/docs/remote.html. 
-#
-# * One layer - full-powered Xapian is called directly from the Ruby, without
-# Solr getting in the way whenever you want to use a new feature from Lucene.
-#
-# * No Java - an advantage if you're more used to working in the rest of the
-# open source world. acts_as_xapian, it's pure Ruby and C++.
-#
-# * Xapian's awesome email list - the kids over at xapian-discuss are super
-# helpful. Useful if you need to extend and improve acts_as_xapian. The
-# Ruby bindings are mature and well maintained as part of Xapian.
-# http://lists.xapian.org/mailman/listinfo/xapian-discuss
-#
-#
-# Indexing
-# ========
-#
-# 1. Put acts_as_xapian in your models that need search indexing.
-#
-# e.g. acts_as_xapian :texts =&gt; [ :name, :short_name ],
-#        :values =&gt; [ [ :created_at, 0, &quot;created_at&quot;, :date ] ],
-#        :terms =&gt; [ [ :variety, 'V', &quot;variety&quot; ] ]
-#
-# Options must include:
-# :texts, an array of fields for indexing with full text search 
-#         e.g. :texts =&gt; [ :title, :body ]
-# :values, things which have a range of values for indexing, or for collapsing. 
-#         Specify an array quadruple of [ field, identifier, prefix, type ] where 
-#         - number is an arbitary numeric identifier for use in the Xapian database
-#         - prefix is the part to use in search queries that goes before the :
-#         - type can be any of :string, :number or :date
-#         e.g. :values =&gt; [ [ :created_at, 0, &quot;created_at&quot; ], [ :size, 1, &quot;size&quot;] ]
-# :terms, things which come after a : in search queries. Specify an array
-#         triple of [ field, char, prefix ] where 
-#         - char is an arbitary single upper case char used in the Xapian database
-#         - prefix is the part to use in search queries that goes before the :
-#         e.g. :terms =&gt; [ [ :variety, 'V', &quot;variety&quot; ] ]
-# A 'field' is a symbol referring to either an attribute or a function which
-# returns the text, date or number to index. Both 'number' and 'char' must be
-# the same for the same prefix in different models.
-#
-# Alternatively, 
-# :instead_index, a field which refers to another model that should be reindexed
-#          instead of this one.
-#
-# Options may include:
-# :eager_load, added as an :include clause when looking up search results in
-# database
-# :if, either an attribute or a function which if returns false means the
-# object isn't indexed
-#
-# 2. Make and run the migration to create the ActsAsXapianJob model, code below
-# (search for ActsAsXapianJob).
-#
-# 3. Call 'rake xapian::rebuild_index models=&quot;ModelName1 ModelName2&quot;' to build the index
-# the first time (you must specify all your indexed models). It's put in a
-# development/test/production dir in acts_as_xapian/xapiandbs.
-#
-# 4. Then from a cron job or a daemon, or by hand regularly!, call 'rake xapian:update_index'
-#
-#
-# Querying
-# ========
-#
-# If you just want to test indexing is working, you'll find this rake task
-# useful (it has more options, see lib/tasks/xapian.rake)
-#   rake xapian:query models=&quot;PublicBody User&quot; query=&quot;moo&quot;
-#
-# To perform a query call ActsAsXapian::Search.new. This takes in turn:
-#   model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
-#   query_string - Google like syntax, see below
-# And then a hash of options:
-#   :offset - Offset of first result
-#   :limit - Number of results per page
-#   :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
-#   :sort_by_ascending - Default true, set to false for descending sort
-#   :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
-#
-# Google like query syntax is as described in http://www.xapian.org/docs/queryparser.html
-# Queries can include prefix:value parts, according to what you indexed in the
-# acts_as_xapian part above. You can also say things like model:InfoRequestEvent 
-# to constrain by model in more complex ways than the :model parameter, or
-# modelid:InfoRequestEvent-100 to only find one specific object.
-#
-# Returns an ActsAsXapian::Search object. Useful methods are:
-#   description - a techy one, to check how the query has been parsed
-#   matches_estimated - a guesstimate at the total number of hits
-#   spelling_correction - the corrected query string if there is a correction, otherwise nil
-#   results - an array of hashes containing:
-#       :model - your Rails model, this is what you most want!
-#       :weight - relevancy measure
-#       :percent - the weight as a %, 0 meaning the item did not match the query at all
-#       :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
+# See ../README.txt for documentation. Please update that file as you
+# change this code.
 
 require 'xapian'
 
@@ -383,7 +241,7 @@ module ActsAsXapian
         # date ranges or similar. Use this for cheap highlighting with
         # TextHelper::highlight, and excerpt.
         def words_to_highlight
-            query_nopunc = self.query_string.gsub(/[^a-z0-9:\.\/]/i, &quot; &quot;)
+            query_nopunc = self.query_string.gsub(/[^a-z0-9:\.\/_]/i, &quot; &quot;)
             query_nopunc = query_nopunc.gsub(/\s+/, &quot; &quot;)
             words = query_nopunc.split(&quot; &quot;)
             # Remove anything with a :, . or / in it
@@ -432,22 +290,7 @@ module ActsAsXapian
     ######################################################################
     # Index
    
-    # Offline indexing job queue model, create with this migration:
-    #    class ActsAsXapianMigration &lt; ActiveRecord::Migration
-    #        def self.up
-    #           create_table :acts_as_xapian_jobs do |t|
-    #                t.column :model, :string, :null =&gt; false
-    #                t.column :model_id, :integer, :null =&gt; false
-    #
-    #                t.column :action, :string, :null =&gt; false
-    #            end
-    #            add_index :acts_as_xapian_jobs, [:model, :model_id], :unique =&gt; true
-    #        end
-    #
-    #        def self.down
-    #            remove_table :acts_as_xapian_jobs
-    #        end
-    #    end
+    # Offline indexing job queue model, create with migration in ../README.txt
     class ActsAsXapianJob &lt; ActiveRecord::Base
     end
 </diff>
      <filename>lib/acts_as_xapian.rb</filename>
    </modified>
  </modified>
  <removed type="array"/>
  <parents type="array">
    <parent>
      <id>52307e3a15c1adfd9429cdc2e07a397360e38b4a</id>
    </parent>
  </parents>
  <author>
    <name>Francis Irving</name>
    <email>francis@cat.(none)</email>
  </author>
  <url>http://github.com/frabcus/acts_as_xapian/commit/f29f23658c4eb2c33583d14db59afedc2890639d</url>
  <id>f29f23658c4eb2c33583d14db59afedc2890639d</id>
  <committed-date>2008-05-16T07:05:48-07:00</committed-date>
  <authored-date>2008-05-16T07:05:48-07:00</authored-date>
  <message>Documentation moving.
Adding _ to highlight</message>
  <tree>3b82ed5e934bfef7ead7718a0de20620e0eb1e91</tree>
  <committer>
    <name>Francis Irving</name>
    <email>francis@cat.(none)</email>
  </committer>
</commit>
