Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Index Tank Integration with your fav ORM

branch: master

Fetching latest commit…

Octocat-spinner-32-eaf2f5

Cannot retrieve the latest commit at this time

Octocat-spinner-32 autotest
Octocat-spinner-32 lib
Octocat-spinner-32 spec
Octocat-spinner-32 .document
Octocat-spinner-32 .gitignore
Octocat-spinner-32 Gemfile
Octocat-spinner-32 LICENSE
Octocat-spinner-32 README.rdoc
Octocat-spinner-32 Rakefile
Octocat-spinner-32 VERSION
Octocat-spinner-32 tanker.gemspec
README.rdoc

Tanker - IndexTank integration to your favorite ORM

IndexTank is a great search indexing service and this gem lets any Ruby ORM stay in sync with IndexTank easily. Tanker provides a clean API for defining what data you want indexed and for querying that data. Query results natively support pagination via will_paginate/collection.

Requirements

As long as your Ruby objects have an 'id' method Tanker will index them.

Installation

gem install tanker

If you are using Rails 3 its better to add Tanker to your GEMFILE

gem 'tanker'

And run

bundle install

Initialization

If you're using Rails, config/initializers/tanker.rb is a good place for this:

YourAppName::Application.config.index_tank_url = 'http://:xxxxxxxxx@xxxxx.api.indextank.com'

If you are not using rails you can put this somewhere before you load your modesl

Tanker.configuration = {:url => 'http://:xxxxxxxxx@xxxxx.api.indextank.com' }

You would probably want to have fancier configuration depending on your environment. Be sure to copy and paste the correct url provided by the IndexTank Dashboard

Example

class Topic < ActiveRecord::Base
  acts_as_taggable
  has_many :authors

  # just include the Tanker module
  include Tanker

  # define the index by supplying the index name and the fields to index
  # this is the index name you create in the Index Tank dashboard
  # you can use the same index for various models Tanker can handle
  # indexing searching on different models with a single Index Tank index
  tankit 'my_index' do
    indexes :title
    indexes :content
    indexes :id
    indexes :tag_list #NOTICE this is an array of Tags! Awesome!

    # you may also dynamically retrieve field data
    indexes :author_names do
      authors.map {|author| author.name }
    end

    # to pass in a list of variables with your document,
    # supply a hash with the variable integers as keys:
    variables do
      {
        0 => authors.first.id, # these will be available in your queries
        1 => id                # as doc.var[0] and doc.var[1]
      }
    end

    # You may defined some server-side functions that can be
    # referenced in your queries. They're always referenced by
    # their integer key:
    functions do
      {
        1 => '-age',
        2 => 'doc.var[0] - doc.var[1]'
      }
    end
  end

  # define the callbacks to update or delete the index
  # these methods can be called whenever or wherever
  # this varies between ORMs
  after_save :update_tank_indexes
  after_destroy :delete_tank_indexes

  # Note! Will Paginate pagination, thanks mislav!
  def self.per_page
    5
  end

end

Create Topics

Topic.create(:title => 'Awesome Topic', :content => 'blah, blah', :tag_list => 'tag1, tag2')
Topic.create(:title => 'Bad Topic', :content => 'blah, blah', :tag_list => 'tag1')
Topic.create(:title => 'Cool Topic', :content => 'blah, blah', :tag_list => 'tag3, tag2')

Search Topics

@topics = Topic.search_tank('Topic', :page => 1, :per_page => 10) # Gets 3 results!
@topics = Topic.search_tank('blah',  :conditions => {:tag_list => 'tag1'}) # Gets 2 results!
@topics = Topic.search_tank('blah',  :conditions => {:title => 'Awesome', :tag_list => 'tag1'}) # Gets 1 result!

Search multiple models

@results = Tanker.search([Topic, Post], 'blah') # Gets any Topic OR Post results that match!

Search with negative conditions

@topics = Topic.search_tank('keyword',  :conditions => {'tag_list' => 'tag1'}) # Gets 1 result!
@topics = Topic.search_tank('keyword',  :conditions => {'-id' => [5,9]}) # Ignores documents with ids of '5' or '9'

Search with query variables

@topics = Topic.search_tank('keyword',  :variables => {0 => 5.6}) # Makes 5.6 available server-side as q[0]

Paginate Results

<% will_paginate(@topics) %>

Geospatial Search Example

IndexTank and the Tanker gem support native geographic calculations. All you need is to define two document variables with your latitude and your longitude

class Restaurant < ActiveRecord::Base
  include Tanker

  tankit 'my_index' do
    indexes :name

    # You may define lat/lng at any variable number you like, as long
    # as you refer to them consistently with the same integers
    variables do
      {
        0 => lat
        1 => lng
      }
    end

    functions do
      {
        # This function is good for sorting your results. It will
        # rank relevant, close results higher and irrelevant, distant results lower
        1 => "relevance / miles(d[0], d[1], q[0], q[1])",

        # This function is useful for limiting the results that your query returns.
        # It just calculates the distance of each document in miles, which you can use
        # in your query. Notice that we're using 0 and 1 consistently as 'lat' and 'lng'
        2 => "miles(d[0], d[1], q[0], q[1])"
      }
    end
  end
end

# When searching, you need to provide a latitude and longitude in the query so that
# these variables are accessible on the server.
Restaurant.search("tasty",
                  :var0 => 47.689948,
                  :var1 => -122.363684,
                  # And we'll return all results sorted by distance
                  :function => 1)

# Or we can use just function 2 to return results in the default order, while
# limiting the total results to just those within 50 miles from our lat/lng:
Restaurant.search("tasty",
                  :var0 => 47.689948,
                  :var1 => -122.363684,
                  # the following is a fancy way of saying "return all results where
                  the result of function 2 is at least anything and less than 50"
                  :filter_functions => {2 => [['*', 50]]})
# And we can combine these two function calls to return only results within 50 miles
# and sort them by relevance / distance:
Restaurant.search("tasty",
                  :var0 => 47.689948,
                  :var1 => -122.363684,
                  :function => 1,
                  :filter_functions => {2 => [['*', 50]]})

Categories and faceted search example

If your IndexTank indexes have faceting enabled, you can do this:

class Part < ActiveRecord::Base
  include Tanker

  tankit 'index' do
    indexes :brand
    indexes :part_number

    # return a hash of category/value pairs
    categories do
      {
        'price_range' => (price > 5 ? 'expensive' : 'cheap')
      }
    end
  end
end

To retrieve facets in a search:

@parts, @facets = Part.search_tank('widget', :facets => true)

# @facets =
#   { 'price_range' => { 'expensive' => 93, 'cheap' => 47 } }

At some point you will probably also want to filter what categories to search. That's faceted searching, after all!

# search only cheap parts:
@parts = Part.search_tank('widget', :category_filters => { 'price_range' => 'cheap' })

# highly contrived, but to show how you could filter multiple values for a
# given category:
@parts = Part.search_tank('widget', :category_filters => { 'price_range' => ['cheap', 'expensive'] })

# or multiple categories:
@parts = Part.search_tank('widget', :category_filters => { 'price_range' => ['cheap', 'expensive'],
                                                           'type' => 'Part' })

Extend your index definitions

If you have a bunch of models with a lot of overlapping indexed fields, variables, or categories, you might want to abstract those out into a module that you can include and then extend in the including classes. Something like:

module TankerDefaults
  def self.included(base)
    base.send(:include, ::Tanker) # include the actual Tanker module

    # provide a default index name
    base.tankit 'my_index' do
      # index some common fields
      indexes :tag_list

      # set some common variables
      variables do
        {
          0 => view_count
          1 => foo
        }
      end

      # set some common categories
      categories do
        {
          "type" => self.class.name
        }
      end
    end
  end
end

class SuperModel
  include TankerDefaults

  # no need to respecify the index if it's the same
  # (but you can override it)
  tankit do
    # `indexes :tag_list` is inherited
    indexes :name
    indexes :boyfriend_names do
      boyfriends.map(&:name)
    end

    variables do
      {
        # `0 => view_count` is inherited
        1 => iq,                 # overwrites "foo"
        2 => endorsements.count  # adds new variables
      }
    end

    # and the same can be done with categories
  end
end

You currently can't remove previously defined stuff, though.

An index for each environment

If you want to use different indexes for development and production you can either specify the index name explicitly:

tankit "myapp_#{ENV['RACK_ENV']}" do...

Or you can let Tanker guess the name of the index for you:

tankit do ...

If you're on Rails Tanker will assume the index should be named “app_name_development” or “app_name_production”

Reindex your data

rake tanker:clear_indexes

This task deletes all your indexes and recreates empty indexes, if you have not created the indexes in the Index Tank interface this task creates them for you provided you have enough available indexes in your account.

rake tanker:reindex

This task pushes any functions you've defined in your tankit() blocks to indextank.com

rake tanker:functions

This task re-indexes all the your models that have the Tanker module included. Usually you will only need a single Index Tank index but if you want to separate your indexes please use different index names in the tankit method call in your models

To reindex just one class rather than all the classes in your application just call:

Model.tanker_reindex

And if you'd like to specify a custom number of records to PUT with each call to indextank.com you may specify it as the first argument:

Model.tanker_reindex(:batch_size => 1000) # defaults to batch size of 200

Or you can restrict indexing to a subset of records. This is particularly helpful for massive datasets where only a small portion of the records is useful.

Model.tanker_reindex(:batch_size => 1000, :scope => :not_deleted) # will call Model.not_deleted.all internally

TODO

  • Non-rails integration (like implicit index naming)

  • allowing functions to be named by strings in the app, only converted to integers when talking to IndexTank

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Acknowledgements

The gem is based on a couple of projects:

*redis-textsearch *mongomapper *thinkingtank *will_paginate

Kudos to this awesome projects and developers

Copyright

Copyright © 2010 @kidpollo. See LICENSE for details.

Something went wrong with that request. Please try again.