
Discontinued, please see the

Upcoming Version

Version 4.6.1 “small change”

  • hanke: Many small internal improvements.

Version 4.6.0 “better being prepared than prepared prepared”

  • hanke: Important Note: This release changes the file location of the prepared indexes! If you rely on this not changing, you need to adapt your scripts!
  • hanke: (server) Prepared files had a double “prepared” and an unnecessary “index” in the file name. They do not anymore. For example, prepared_keywords_index.prepared.txt has changed to keywords.prepared.txt, the tokenized index for the keywords category.

Version 4.5.12 “counts count more than weights”

  • hanke: (server) Facets API now returns counts rather than weights.
  • hanke: (server) Facets API changed option more_than into at_least. Pass it if you only need facets with at least a certain count.
  • hanke: (server) Facets API added option counts (true/false). Facets methods return a hash with counts if it is true or not given (i.e. nil), and an array if it is false.
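
The facets options above can be pictured with a small plain-Ruby sketch. It does not use Picky: `facets_for` and the `INVERTED_BRANDS` hash (token => ids) are hypothetical stand-ins for a category's exact index, and only the option names `at_least` and `counts` come from the notes above.

```ruby
# Hypothetical stand-in for a category's exact index: token => ids.
INVERTED_BRANDS = {
  'mammut' => [1, 2, 3],
  'salewa' => [4],
  'scarpa' => [5, 6]
}

# Returns facets as { token => count } if counts is true or not given,
# and as an array of tokens if counts is false.
# at_least drops facets with a smaller count.
def facets_for(inverted, options = {})
  at_least = options[:at_least]
  counts   = options[:counts]

  result = {}
  inverted.each do |token, ids|
    next if at_least && ids.size < at_least
    result[token] = ids.size
  end

  counts == false ? result.keys : result
end
```

With this sketch, `facets_for(INVERTED_BRANDS, at_least: 2)` returns `{ 'mammut' => 3, 'scarpa' => 2 }`, and adding `counts: false` returns `['mammut', 'scarpa']`.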

Version 4.5.11 “partial to facets”

  • hanke: (server) Search#facets now does not access the partial index, but always the exact index.

Version 4.5.10 “fastcets”

  • hanke: (server) Search#facets performance improved.

Version 4.5.8 / 4.5.9 “ruby and its many facets”

  • hanke: (server) Experimental simple facets support.
  • hanke: (server) Added Index#facets(:category_name, options = {}) with options: more_than (a minimum weight a facet needs to have to be included). Will return keys and weights.
  • hanke: (server) Added Search#facets(:category_name, options = {}) with options: filter (a query to filter with, e.g. 'brand:mammut'), and more_than (a minimum weight, see above).
  • hanke: Note – if your data is very dirty (i.e. many facets that occur only once), consider using a minimum to speed up the facets query!
  • hanke: Usage – products.facets :brand_name, filter: 'category:boots', more_than: 0 (will return all brand_name facets filtered by 'category:boots' that have more weight than 0).

Version 4.5.7 “bel hevy”

  • hanke: (server) Added category option weight. The weight option now takes a number and adds that to the default logarithmic weighing. E.g. weight: +6 (very strong positive weight) or weight: -0.5 (slightly negative weight). This results in a higher/lower score.
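
As a rough illustration of how the weight option shifts a score: the sketch below assumes that the default weight is the natural logarithm of the number of ids indexed for a token, and that the option is simply added to it. The method names are hypothetical and not part of Picky.

```ruby
# Assumption: the default weight of a token is the natural logarithm
# of the number of ids indexed for it.
def default_weight(ids)
  Math.log(ids.size)
end

# The category's weight option is added to the default weight.
def category_weight(ids, weight_option = 0)
  default_weight(ids) + weight_option
end
```

A token with a single id has a default weight of 0.0, so `category_weight([1], 6)` gives 6.0 and `category_weight([1], -0.5)` gives -0.5.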

Version 4.5.5 / 4.5.6 “swish swish clang clang”

  • hanke: (server) Clang can now compile Picky.
  • hanke: (server) Much better error message in case Picky can’t be compiled.
  • hanke: (server) Clean compilation on gem install.

Version 4.5.4 “like stripes on a car”

  • hanke: (server) Removed code, making Picky approximately 10% faster.

Version 4.5.3 “hero koo koo”

  • hanke: (server) Check for the existence of RbConfig before compiling.

Version 4.5.2 “statistically probable”

  • hanke: (statistics) New experimental statistics interface. Run picky stats to get the usage.

Version 4.5.1 “JSON, a nightmare”

  • hanke: (server) Fix for multi_json gem usage.

Version 4.5.0 “JSON, a nightmare on Elm street”

  • hanke: (server) Picky now uses the multi_json gem.

Version 4.4.2 “not carbon copy”

  • hanke: (server) Implements a suggestion by David Lowenfels which enables a Picky user to set the CC environment variable to define which C compiler is used.

Version 4.4.1 “even more unique”

  • hanke: (server) Fix for bug introduced in 4.4.0, unique option works correctly with offset.

Version 4.4.0 “unique like a snowflake”

Unique option on search instance. This will remove each result id in allocations if they have appeared in preceding allocations. What does this mean?

You search for "picky search". And you find it in two allocations, name, type and name, name. Let’s say Picky finds ids [1, 2, 3] in name, type and then [2, 3, 4] in name, name. Picky will then remove 2 and 3 from name, name because they have been found in name, type already.

Usually this is used when you only want a list of unique ids in the results.

  • hanke: (server) Added unique: truey/falsy option on Search#search. Use like this: 'query', 20, 0, unique: true.
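
The removal described above can be sketched in plain Ruby. This is not Picky's implementation: `unique_ids` is a hypothetical helper that takes the ids of each allocation in result order.

```ruby
require 'set'

# allocations: array of id arrays, in the order they are returned.
# Each id is kept only in the first allocation it appears in.
def unique_ids(allocations)
  seen = Set.new
  allocations.map do |ids|
    fresh = ids.reject { |id| seen.include?(id) }
    seen.merge(ids)
    fresh
  end
end
```

For the example above, `unique_ids([[1, 2, 3], [2, 3, 4]])` returns `[[1, 2, 3], [4]]`.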

Version 4.3.2 “show of character”

This version lets you define control characters on tokens, like so (shows how, and the default):

  • hanke: (server) Picky::Query::Token.partial_character = '\*' for searching partially.
  • hanke: (server) Picky::Query::Token.no_partial_character = '"' for not searching partially.
  • hanke: (server) Picky::Query::Token.similar_character = '~' for searching similar strings.
  • hanke: (server) Picky::Query::Token.no_similar_character = '"' for not searching similar strings.
  • hanke: (server) Picky::Query::Token.qualifier_text_delimiter = ':' for telling qualifier and string apart (title:sometitle).
  • hanke: (server) Picky::Query::Token.qualifiers_delimiter = ',' for telling qualifiers apart (title,author:bla).

The first four are going to be interpolated into %r, so escape the character like you would in a regexp. The last two are used in String#split, so doing this is not necessary.

So, for example, if you set

Picky::Query::Token.partial_character = '…'

Picky::Query::Token.qualifier_text_delimiter = '?'

Picky::Query::Token.qualifiers_delimiter = '|'

Then you can search like so: "title|author?wittgenstei…"
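
To make the roles of the three characters concrete, here is a plain-Ruby sketch of how such a token could be taken apart. It does not use Picky's token class: the constants and `parse_token` are hypothetical, and only the regexp interpolation and the use of String#split follow the description above.

```ruby
# Hypothetical settings mirroring the example above.
PARTIAL_CHARACTER        = '…'
QUALIFIER_TEXT_DELIMITER = '?'
QUALIFIERS_DELIMITER     = '|'

# Splits a raw token into qualifiers, text and a partial flag.
def parse_token(raw)
  # The partial character is interpolated into a regexp, which is why
  # a character like * would have to be escaped when it is set.
  partial_regexp = %r{#{PARTIAL_CHARACTER}\z}
  partial = !!(raw =~ partial_regexp)
  raw     = raw.sub(partial_regexp, '')

  # The delimiters are plain strings handed to String#split.
  qualifiers, text = raw.split(QUALIFIER_TEXT_DELIMITER, 2)
  if text.nil?
    text, qualifiers = qualifiers, nil # no qualifier given
  end

  {
    qualifiers: qualifiers ? qualifiers.split(QUALIFIERS_DELIMITER) : [],
    text:       text,
    partial:    partial
  }
end
```

Here `parse_token("title|author?wittgenstei…")` returns the qualifiers `["title", "author"]`, the text `"wittgenstei"` and `partial: true`.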

Version 4.3.1 “status anxiety”

  • hanke: (server) Sinatra index actions now return more sensible HTTP status codes.

Version 4.3.0 “evening wear”

  • hanke: (client) Gorgeous new design (thanks @tvandervossen!)
  • hanke: (client) Completely overhauled Picky JavaScript.

Version 4.2.4 “fileutils”

  • hanke: (server) Require fileutils regardless of the Ruby version Picky is run in.

Version 4.2.3 “overbird”

  • hanke: (server) Require fileutils in case we run Picky on MacRuby (thanks overbryd).

Version 4.2.2 “cheese royale”

  • hanke: (server) Use the “standard” way to detect the Ruby engine used for MacRuby.

Version 4.2.1 “fries with that”

  • hanke: (server) Experimental extensions to get Picky to run on MacRuby 0.12.
  • hanke: (server) Unfortunately, we don’t have the resources to always run the tests – please use with caution.

Version 4.2.0 “talk to the hand”

  • hanke: (server) Redesigned how Picky logs: Picky itself logs its index handling (tokenizing/dumping/loading) using one of its built-in loggers. Set a logger after requiring ‘picky’ like this: Picky.logger = # or any IO. Default is aka Picky::Loggers::Default. Also an option is Picky::Loggers::Silent. This closes issue 70.
  • hanke: (server) Note: Logging searches is your job (see generated examples on how to do this).

Version 4.1.0 “identification needed”

  • hanke: (server) Picky::Results#ids(only = nil) returns the amount of ids originally requested, except if an only amount gets passed in (then that amount is used).

Version 4.0.9 “new york new york”

  • hanke: (client) Picky::Client::ActiveRecord.configure(options) added as an alias of new (thanks auastro!).
  • hanke: (server) require 'picky/sinatra/index_actions' is not necessary anymore to load the index actions. They are required automatically with require 'picky/sinatra'.

Version 4.0.8 “smoke and mirrors”

  • hanke: (client/server) Encode data part in JSON.

Version 4.0.7 “supermodel”

  • hanke: Experimental ActiveRecord 3.0+ integration release. See below.
  • hanke: (client) ActiveRecord models can now use extend*attributes_to_send, options = {}) to have the model send updates/deletes back to the Picky server. Note that error handling is not yet built in. The server needs to be up and running.
  • hanke: (server) The Sinatra style server can now extend Picky::Sinatra::IndexActions to install index updating POST/DELETE methods on the “/” path (Note: Currently needs a require 'picky/sinatra/index_actions' beforehand).

Version 4.0.6 “bug #57”

  • hanke: (client) Fixed bug #57, multicategory selections in the Javascript user interface.

Version 4.0.5 “only :you”

  • hanke: (server) Experimental release of only option for searches. Does the same as only:that_category, but implicitly, in the search. E.g. only :cat1, :cat2.

Version 4.0.4 “opinionated environment”

  • hanke: (server) Default amount of similar tokens is now set to 3 instead of 10 for phonetic similarities.
  • hanke: (server) Server uses PICKY_ENV environment variable before RUBY_ENV and then RACK_ENV.
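
The lookup order can be written as a one-line fallback chain. This is a sketch, not the server's code: `picky_environment` is a hypothetical name, and the 'development' fallback is an assumption made here so that the method always returns something.

```ruby
# Looks up the environment name in the order described above.
# The 'development' fallback is an assumption of this sketch.
def picky_environment(env = ENV)
  env['PICKY_ENV'] || env['RUBY_ENV'] || env['RACK_ENV'] || 'development'
end
```

Passing a hash instead of ENV makes the order easy to check: `picky_environment('RACK_ENV' => 'test', 'PICKY_ENV' => 'production')` returns `'production'`.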

Version 4.0.2/3 “mea culpa”

  • hanke: (server) Fix for realtime indexing when using specific options.

Version 4.0.1 “unauthorized”

  • hanke: (server) Customized weight and similarity do not need the saved? method anymore.

Version 4.0.0 “singing in the rain”

  • hanke: No changes from 4.0.0pre7.

Version 4.0.0pre7

  • hanke: (server) BREAKING The tokenizer option for a category has been renamed to indexing, to conform with the methods for the index and the sinatra app.
  • hanke: (server) BREAKING Internal Similarity#encoded method has been renamed to #encode.

Version 4.0.0pre6

  • hanke: (server) Similarity API fixed.

Version 4.0.0pre5

  • hanke: (statistics) Only use 0.01s for checking the log file instead of 0.1.

Version 4.0.0pre4

  • hanke: (statistics) Overhauled statistics interface. Use picky statistics log/search.log to start it.

Version 4.0.0pre3

  • hanke: (server) BREAKING Reverting customizable backends from version 3.3.2. They are no longer available. Please use simple subclassing to achieve funky backends.
  • hanke: (server) BREAKING SQLite self_indexed and Redis immediate option is now called realtime, as changes go directly through to the actual backends, in “realtime”.
  • hanke: (server) The Index#source block is now evaluated every time an indexer runs.

Version 4.0.0pre2

  • hanke: (server) BREAKING Removed Picky classic application. Please use Picky e.g. in a Sinatra app.
  • hanke: (server) BREAKING Removed Picky classic sources. Please use a source with the #each method.
  • hanke: (server) BREAKING Option weights for the Picky::Index#category method has been renamed weight to conform with the other methods.
  • hanke: (server) BREAKING Picky does not require the text gem anymore by default. Only when you use phonetic similarity. It will tell you what it needs.
  • hanke: (server) BREAKING Added the PICKY_ENVIRONMENT in front of the Redis key namespace to differentiate the various environments.
  • hanke: (server) BREAKING Removed rake routes since only the classic server was able to provide it.
  • hanke: (server) BREAKING Removed the classic server from the generators.
  • hanke: (server) Explicitly uses Yajl::Encoder#encode for JSON encoding.
  • hanke: (server) Fixed cases where even when no similarity was defined on a category, similar results were still found.
  • hanke: (server) Rake task index now points to task index:parallel by default. Call rake:serial to index serially.
  • hanke: (server) Indexer calls reconnect! on sources that support it.
  • hanke: (server) Location/Volumetric/Geosearch rewritten.

Version 4.0.0pre1

  • hanke: (server) BREAKING Picky::Indexes.index does not index in parallel anymore.
  • hanke: (server) BREAKING Renamed Picky::Indexes.index_for_tests to Picky::Indexes.index.
  • hanke: (server) If you want to explicitly run parallel indexing programmatically, use Picky::Indexes.index true) or Picky::Indexes[:index_name].index true).
  • hanke: (server) BREAKING Renamed Picky::Wrappers::Category::ExactFirst to Picky::Results::ExactFirst. Extend instead of wrap: index.extend Results::ExactFirst or category.extend Results::ExactFirst. If an index is extended, each category of the index will be extended.
  • hanke: (server) BREAKING Picky::Indexes.reload has been renamed to Picky::Indexes.load.
  • hanke: (server) BREAKING index.reload has been renamed to index.load.
  • hanke: (server) BREAKING category.reload has been renamed to category.load.
  • hanke: (server) BREAKING Removed all define_... methods on indexes.
  • hanke: (generators) Fixed integration specs for the generated “all in one” server/client.
  • hanke: (generators) Changed method calls to adapt to above changes.
  • hanke: (server) Using the procrastinate gem to parallelize indexing.
  • hanke: (server) Indexing call structure cleaned up. Improves performance by about 40%.

Version 3.6.16

  • hanke: (server) Semantics for terminate_early(n) are to calculate n more allocations than necessary. An n of 0 means that only exactly the number of necessary allocations for the ids is calculated.
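
One way to read these semantics, as a plain-Ruby sketch: allocations are processed in order until enough ids for the requested window have been collected, then n more are processed, then processing stops. `calculated_allocations` and its parameters are hypothetical and only model the counting, not Picky's internals.

```ruby
# allocations: array of id arrays, best allocation first.
# Returns the allocations that would be calculated to serve
# `amount` ids starting at `offset`, plus `extra` more allocations.
def calculated_allocations(allocations, amount, offset = 0, extra = 0)
  needed     = offset + amount
  collected  = 0
  calculated = []

  allocations.each do |ids|
    if collected >= needed
      break if extra <= 0 # enough ids and no extra allocations left
      extra -= 1
    end
    calculated << ids
    collected += ids.size
  end

  calculated
end
```

With allocations `[[1, 2], [3, 4], [5], [6]]` and 3 requested ids, an n of 0 calculates the first two allocations, and an n of 1 also calculates `[5]`.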

Version 3.6.14/15

  • hanke: (server) Fix for terminate_early with offsets in 3.6.12 (thanks niko!).

Version 3.6.13

  • hanke: (server) Fix for exact first matching (thanks geelen!).

Version 3.6.12

  • hanke: (server) Picky::Search option terminate_early(integer) or terminate_early(with_extra_allocations: integer) introduces early termination. If in your interface you only need the ids and no total, then this is the option for you. Calling terminate_early without parameters will use 0 as the default.
  • hanke: (server) Fix for exact first matching (thanks geelen!).

Version 3.6.11

  • hanke: (server) Fix for bad performance bug introduced somewhere in 2.4.
  • hanke: (server) Backends rewritten to support realtime indexes (SQLite, Redis). Memory already supported it (needs call to Index#build_realtime_mapping after loading if dumped+loaded). File backend will not support realtime index in the near future.
  • hanke: (server) Experimental, use at your own peril: Method to build the realtime index, explicitly: Index#build_realtime_mapping.

Version 3.6.10

  • hanke: (generators/server) script/console command minified in the generation and moved to the server.

Version 3.6.9

Version 3.6.8

  • hanke: (server) BREAKING Renamed the undocumented Tokenizer#maximum_tokens(integer) to Tokenizer#max_words(integer). Restricts the amount of words that the tokenizer lets through to the core search engine.
  • hanke: (server) Added Search#max_allocations(integer) to restrict number of allocations that are actually calculated (to avoid combinatorial and UI explosions).
  • hanke: (server) Added << and unshift on Index and Category. The unshift method behaves like the add method when that one is called without a second parameter. Use like index <<, 'some text', 'some other text').
  • hanke: (server) Existence of a source is only checked when really needed. Will fail hard if there is none, with a (hopefully) useful error message.

Version 3.6.7

  • hanke: (server) Experimental #build_realtime_mapping method to rebuild the realtime mapping helper after a dump/load.

Version 3.6.6

  • hanke: (server) Fix and regression spec for a Redis backend bug introduced in 3.6.5.

Version 3.6.5

  • hanke: (server) Exact-first wrapper for experimental purposes.

Version 3.6.4

  • hanke: (server) Removed active record, redis, mysql dependencies from picky.gemspec.

Version 3.6.3

  • hanke: (server) From Redis 2.6.0 on, Picky will be around 65% faster with Redis as a backend.

Version 3.6.2

  • hanke: (client) Fixed Javascript. See
  • hanke: (server) Weights now only saved up to the third position after the decimal point.
  • hanke: (server) SQLite backend has been renamed from Sqlite to SQLite.
  • hanke: (server) Backends can be switched dynamically (use index.backend = new_backend). Used for performance tests.

Version 3.6.1

  • hanke: (server) Removed sqlite3 from gemspec to enable Heroku compatibility. Please add it in your Gemfile if you need it or simply install the gem separately.

Version 3.6.0

This release includes BREAKING changes. See below.

  • hanke: This version tries to reduce maintenance complexity and prepare for 4.0.
  • hanke: (server) BREAKING In your code, rename any occurrences of Indexes.reload, Indexes#reload, Index#reload, Category#reload with an equivalent load method.
  • hanke: (server) Renamed load_from_cache with load on Indexes, Index, Category.
  • hanke: (server) Removed rake check and related methods with no replacement. Please tell us if you miss it.
  • hanke: (server) Removed Index#backup, Index#restore and related methods on Category etc. with no replacements. Please tell us if you miss them.
  • hanke: (server) Fix for the problem that #remove(id) didn’t remove when a different key_format than the standard one was defined (Thanks niko!).

Version 3.5.4

  • hanke: (server) Fix for using Rack::Harakiri in an example project. (Ok, time for bed)

Version 3.5.3

  • hanke: (server) Fix for using dynamic weights and then deleting something from it.

Version 3.5.2

  • hanke: (server) Changed the way the internal backend is dumped to json or marshalled.

Version 3.5.1

  • hanke: (server) generate_from methods have been removed from all generators as they are not used anymore.
  • hanke: (server) Added the option of having dynamic weights calculation. Use this if you don’t need weights based on the amount of indexed ids per token. This does not generate an index in the backend (Redis or file), but calculates the weight at runtime. Examples: Always return the default 0.0, category :text, weights: or always return 3.14, category :text1, weights: or calculate a weight at runtime, based on the size of the str_or_sym we are looking for, category :text1, weights: { |str_or_sym| str_or_sym.size }. We recommend using search boosts to boost specific category combinations.

Version 3.5.0

  • hanke: (server) Internally, tokens are held as strings. This helps dealing with memory issues when using realtime indexes. This might make Picky’s memory usage a bit higher than before. However, when using realtime indexes, the memory usage will be much improved.
  • hanke: (server) Complete internal rewrite of how indexing is handled.

Version 3.4.3

  • hanke: (server) Performance fix for problem introduced in 3.4.3.

Version 3.4.2

  • hanke: (server) Fixed a bug where ids occurred multiple times for an indexed token in the same index bundle (thanks M. Below for finding the bug). This did not impact on the search results, just the stored index files.

Version 3.4.1

  • hanke: (server) Intermittent service release to test internal String-based indexes.

Version 3.4.0

  • hanke: (client) Method populate_with keeps the ids by default. Use clear_ids on the results if you want to remove them.

Version 3.3.3

Version 3.3.2

  • hanke: (server) Internal interface for generators changed. The generators are now used directly, e.g.: 1).generate_from inverted_index_hash. No change on your part is necessary if you didn’t use Picky::Generators::{Partial,Weights,Similarity}Generator.
  • hanke: (server) Experimental exchangeable backend change: Redis now passes bundle, client into the lambda, instead of client, bundle. E.g. inverted: ->(bundle, client) {, "#{bundle.identifier}:inverted") }

Version 3.3.1

  • hanke: (server) Fix for Partial::None, introduced in 3.3.0.

Version 3.3.0

  • hanke: (server) ActiveRecord is not loaded anymore by default, as only few users use the Picky db source (if you do, Picky will try to require it and tell you if it can’t).
  • hanke: (server) It is now possible to explicitly dump an index, using index.dump. This is useful with realtime indexes.
  • hanke: (server) Added a new partial option, Postfix, with an option, from. With from: -4 and a word like octopus, will generate partials [:octopus, :octopu, :octop, :octo] (until -4).
    New default option is -3), not -3, to: -1) anymore. The two options are identical in function.
  • hanke: (server) Only Picky’s tokenizers call to_s on data anymore. This means that you can write tokenizers that work on whatever kind of object you like. The Picky standard tokenizers themselves ensure that they get to work with a string.
  • hanke: (server) Fix for Substring partialization, when negative from and to options are used at the same time.
  • hanke: (server) Experimental exchangeable backends.
  • hanke: (project) RSpec 1 has been updated to RSpec 2.
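
The Postfix partial option from this version can be illustrated without Picky. In the sketch below, `postfix_partials` is a hypothetical helper, and a negative `from` is read like a Ruby string index (so -4 on octopus ends at "octo"), which matches the example in the notes. Positive values are not handled.

```ruby
# Generates postfix partials for a token: the token itself and every
# shortened version down to the position given by a negative `from`.
def postfix_partials(token, from)
  text = token.to_s
  # from: -4 on a 7 letter word keeps at least 7 - 4 + 1 = 4 letters.
  shortest = [text.size + from + 1, 1].max
  text.size.downto(shortest).map { |length| text[0, length].to_sym }
end
```

`postfix_partials(:octopus, -4)` returns `[:octopus, :octopu, :octop, :octo]`.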

Version 3.2.0

This release includes BREAKING changes. See below.

  • hanke: (server) Removed bundler specific code from Picky. You can now decide yourself if you want it. Opens the possibility to just run Picky in a script to try ideas etc. (see example gist:
  • hanke: (generators) The generated Sinatra server does not use bundler anymore. Classic servers (might) still need it. You can add it back in by adding the following code in app.rb, right after require 'picky':
begin
  require 'bundler'
rescue LoadError => e
  require 'rubygems'
  require 'bundler'
end

Version 3.1.13

  • hanke: (generators) picky generate will not display the error backtrace part anymore.

Version 3.1.12

  • hanke: (server) Runtime indexing (remove, add, replace) now possible on a single category. Please use e.g. index[:category_name].add some_object_with_id_and_category_name_method.

Version 3.1.11

  • hanke: (server) See last release. This release adds support for similarity searches on a realtime index.
  • Please only use realtime indexing for experimental purposes.

Version 3.1.10

  • hanke: (server) This release holds an experimental release of realtime indexing for 3.2: An index now supports #add(object_responding_to_id_and_categories), #remove(id_of_added_object), #replace(object_responding_to_id_and_categories). Replace is simply remove+add. Replacing a non-existent object behaves like an add. I suggest using solely replace. Notes: Only works in single-process, single-threaded servers. Does not persist. Currently only works when starting from an empty index, e.g. source [].
  • Please only use realtime indexing for experimental purposes.
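
The add/remove/replace semantics described above can be modelled with a toy inverted index. This is a sketch and not Picky's implementation: `ToyRealtimeIndex`, `Thing` and `ids_for` are hypothetical, and there is only a single text category.

```ruby
require 'set'

# A toy single-category inverted index.
class ToyRealtimeIndex
  def initialize
    @inverted = Hash.new { |hash, token| hash[token] = Set.new } # token => ids
    @tokens   = {}                                                # id => tokens
  end

  def add(object)
    tokens = object.text.downcase.split
    @tokens[object.id] = tokens
    tokens.each { |token| @inverted[token] << object.id }
  end

  def remove(id)
    (@tokens.delete(id) || []).each do |token|
      @inverted[token].delete(id)
      @inverted.delete(token) if @inverted[token].empty?
    end
  end

  # Replace is remove + add; for an unknown id it behaves like add.
  def replace(object)
    remove(object.id)
    add(object)
  end

  def ids_for(token)
    @inverted.key?(token) ? @inverted[token].to_a : []
  end
end

Thing = Struct.new(:id, :text)
```

Using only `replace` keeps the calling code simple: `index.replace(Thing.new(1, 'Picky Search'))` works whether or not id 1 is already indexed.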

Version 3.1.9

  • hanke: (server) Rewrite of “rake index” – Picky will only fork processes if there is the capability to fork (i.e. not Windows), or if there is more than one processor available.

Version 3.1.8

Version 3.1.7

Version 3.1.6

Version 3.1.5

  • hanke: (server) New Search block option: ignore_unassigned_tokens(truey/falsy). Default is false. If true, will ignore tokens that cannot be assigned to any category. If you search for example for "Picky Garblegarblegarble", and "Garblegarblegarble" isn’t in any index, then it will return results as if "Garblegarblegarble" hadn’t been there. In this case, it will just return something like searchengine:“picky”.

Version 3.1.4

  • hanke: (server) Don’t fork if there’s just one index to be processed.

Version 3.1.3

  • hanke: (server) Added #ignore option to Search definition block. Calling ignore :name will ignore tokens in allocations that are mapped to the name category. Example: You search for “David Hasselhoff”. If Picky maps this to allocations [ [:first_name, :name], [:first_name, :movie_title] ], only [ [:first_name], [:first_name, :movie_title] ] will survive. The Hasselhoff - name match will simply be ignored.
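
The David Hasselhoff example above boils down to removing the ignored category names from each allocation. A plain-Ruby sketch, with `ignore_categories` as a hypothetical helper rather than a Picky method:

```ruby
# allocations: arrays of category names, one name per query token.
# Ignored categories are removed from every allocation.
def ignore_categories(allocations, *ignored)
  allocations.map { |categories| categories - ignored }
end
```

`ignore_categories([[:first_name, :name], [:first_name, :movie_title]], :name)` returns `[[:first_name], [:first_name, :movie_title]]`.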

Version 3.1.2

  • hanke: (generated client) The before Javascript callback option given to the PickyClient has changed signature and how it is called. Old was before(params, query), and the returned params changed the params. This did not allow changing the query in the callback. New is before(query, params) and the returned query replaces the query given as parameter. This allows changing the query before sending it off. The params can be changed as well, using params['option'] = value;.

Version 3.1.1

  • hanke: (server) rake index does not fork anymore if there’s just one index to be indexed.
  • hanke: (server) Experimental Picky::Partial::Infix partial generator. Use to find all possible substrings inside words. Options are min, max, both take negative and/or positive values. Negative values indicate length up to length – X. E.g. min: 3, max: -1 # :hello => [:hello, :hell, :ello, :hel, :ell, :llo]
  • hanke: (server) Experimental Picky::Backends::File file backend. Use in index definition block as follows: backend Use if you don’t want Picky to use as much memory. Performance penalty applies.
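
The Infix example from this version (min: 3, max: -1 on :hello) can be reproduced with a short sketch. It is not the generator's code: `infix_partials` and `resolve_length` are hypothetical, and the reading of negative values (-1 means the full word length, like a Ruby string index) is inferred from that single example.

```ruby
# Resolves a min/max option: negative values count back from the
# word length (-1 is the full length), positive values are taken as is.
def resolve_length(value, size)
  value < 0 ? size + value + 1 : value
end

# All substrings with a length between min and max, longest first.
def infix_partials(token, min, max)
  text     = token.to_s
  shortest = [resolve_length(min, text.size), 1].max
  longest  = [resolve_length(max, text.size), text.size].min

  longest.downto(shortest).flat_map do |length|
    (0..text.size - length).map { |start| text[start, length].to_sym }
  end
end
```

`infix_partials(:hello, 3, -1)` returns `[:hello, :hell, :ello, :hel, :ell, :llo]`.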

Version 3.1.0

This release includes BREAKING changes. See below.

  • hanke: (server) Exchangeable backends. New index definition: Indexes::Memory and Indexes::Redis are now unified in Index. So use index = from now on. (See next point)
  • hanke: (server) A new option has been added to the index, backend. It takes a backend instance, making the backend exchangeable. The default is the memory backend, which you do not need to set. If you want a Redis backend, use as follows: index = { backend }. If you want to explicitly set the memory backend: index = { backend }.
  • hanke: (server) Unified tokenizers. Method #tokenize(text) now returns [ ["token", "token", "token"], ["Original", "Original", "Original"] ]. So your own tokenizer only needs to adhere to this interface and can be passed to the index/search using the indexing/searching method.
  • hanke: (server) Removed tokenizer option removes_characters_after_splitting: /some regexp/ (without replacement).
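
A custom tokenizer that follows the unified interface only has to return the two arrays described above. The class below is a hypothetical example written for illustration, and the way it splits and downcases is an assumption, not Picky's default behaviour.

```ruby
# A hypothetical custom tokenizer. The only requirement taken from the
# notes above is the return format of #tokenize:
#   [ [tokens...], [originals...] ]
class WhitespaceTokenizer
  def tokenize(text)
    originals = text.to_s.split(/\s+/).reject(&:empty?)
    tokens    = originals.map(&:downcase)
    [tokens, originals]
  end
end
```

`WhitespaceTokenizer.new.tokenize("Picky Search")` returns `[["picky", "search"], ["Picky", "Search"]]`.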

Version 3.0.1

  • hanke: (server) Fixed & integration tested rake tasks (Thanks rogerbraun!)

Version 3.0.0

This release includes BREAKING changes. See below. (Here we start with this style of BREAKING notation)

  • hanke: (client) BREAKING Removed method Picky::Convenience#allocations_size. Use #allocations.size.
  • hanke: (server) BREAKING Removed Results#to_log. Results#to_s returns a log worthy string now.
  • See changes in pre versions for complete changelog on 3.0.

Version 3.0.0.pre5

  • hanke: (server) Renamed Picky::Result#serialize → Picky::Result#to_hash.

Version 3.0.0.pre4

  • hanke: (generators) Added an All-In-One (Client + Server) Sinatra web app. This proves useful when wishing to use Picky on Heroku.

Version 3.0.0.pre3

  • hanke: (client) Gemfile referred to version ~> 2.0 instead of = 3.0.0.pre2.

Version 3.0.0.pre2

  • hanke: (server) Breaking: Index::Memory and Index::Redis do not accept options anymore.
Define options in the block or on the resulting instances some_index = do source … key_format … category … category … category … result_identifier … end
  • hanke: (server) Breaking: PickyLog removed.
In the classic server, use Picky.logger = ‘log/search.log’ if you want to log (uses SomeLogger#info). In the Sinatra server, use MyLogger = ‘log/search.log’ … get ‘/path’ do result = … result.to_log(params[:query]) if you want to log. result.to_json end
  • hanke: (server) Breaking: app/logging.rb not loaded anymore. You have to require it yourself if you want that.
  • hanke: (server) A missing source is only noticed when it is used (such as in indexing). This makes it possible to set a source at a later time.

Version 3.0.0.pre1

  • hanke: (server) Note: The key_format is not saved in the index configuration anymore.
  • hanke: (generator) New example server, sinatra_server. The new default, very flexible.

Version 2.7.0

  • hanke: (server) Breaking: Method #take_snapshot removed from Indexes/Index/Category (not needed anymore).
  • hanke: (server) Breaking: Users need to reindex when installing this version (index “index” now identified by “inverted” to be more clear).
  • hanke: (server) Rake tasks rewritten to be simpler and clearer. Most notably, index:specific[index,category] is now just index[index,category] (both optional).
  • hanke: (server) Reindexing now possible in running server, also for ActiveRecord Arel sources.
  • hanke: (server) More verbose indexing output with file locations.
  • hanke: (server) Taking data snapshots improved.
  • hanke: (client) Fix for e.g. picky search localhost:8080/books if highline gem is missing (thanks tonini!).

Version 2.6.0

  • hanke: (server) Breaking: Indexes#find method has been removed. Use Indexes[index_name] and Indexes[index_name][category_name].
  • hanke: (server) Breaking: Index#index!, Index#cache!, Category#index!, Category#cache! have been removed. Use Indexes.index (combines index! and cache!), or Indexes[books].index, or Indexes[books][title].index.
  • hanke: (server) Get Indexes/Categories using the #[] method. E.g. Indexes[:books] to get the :books index, and Indexes[:books][:author] to get the :author category of the :books index.
  • hanke: (server) Indexes, Indexes[:some_index], and Indexes[:some_index][:some_category] now all support the following methods:
  • #index (just index: prepare data and cache data)
  • #reload (just reload the cached data into the server, no effect on Redis indexes)
  • #reindex (index and reload one category after another)
Note: #reload and #reindex only make sense in a running server with memory indexes. Examples:
  • Indexes.index (index all indexes, randomly)
  • Indexes[:some_index].reindex (reindex that index)
  • Indexes[:some_index][:some_category].reload (just reload that category)

Version 2.5.2

  • hanke: (server) Fixed: Redis indexing. Old values are now removed on reindexing.

Version 2.5.1

  • hanke: (server) Minor changes.

Version 2.5.0

  • hanke: (server) Searches can now search in multiple qualifiers, separating them by a “,”. E.g. name,street:tyne.
  • hanke: (server) Searches will no longer search in all categories (fields) if a qualifier has been mistyped. So, namme:peter will not search in all categories, but instead return an empty result if category namme does not exist.
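
Both changes concern how a qualifier is mapped to categories. The sketch below models that mapping in plain Ruby. `CATEGORIES` and `categories_for` are hypothetical, and dropping unknown qualifiers from a mixed list is an assumption of the sketch, since the notes only describe the case where the single qualifier is mistyped.

```ruby
CATEGORIES = [:name, :street, :city] # hypothetical index categories

# Returns the categories to search in for a token like "name,street:tyne".
def categories_for(token, categories = CATEGORIES)
  qualifiers, text = token.split(':', 2)
  return categories if text.nil? # no qualifier: all categories

  # Unknown qualifiers are dropped; if none is left, nothing is searched.
  qualifiers.split(',').map(&:to_sym) & categories
end
```

`categories_for('name,street:tyne')` returns `[:name, :street]`, while `categories_for('namme:peter')` returns `[]`, which stands for the empty result.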

Version 2.4.3

  • hanke: (server) Fixed: Indexing a single category where a #each source was used using rake index:specific[index,category] raised an error.

Version 2.4.2

  • hanke: (server) Live interface for picky-live gem fixed.

Version 2.4.1

  • hanke: (server) Fixes Redis indexing.
  • hanke: (client) Requires activesupport (thanks stanley!).

Version 2.4.0

  • hanke: (server) Added a configuration option key_format for index, categories. It sets the format that this index’/category’s keys are in. Use as you would with source, as either method in the index block, as index parameter, or category parameter.
  • hanke: (client) The client is now finally really data driven by the server, see next changes.
  • hanke: (client) Added two options for the PickyClient, fullResults and liveResults. They designate how many results should be rendered. Defaults are for full: 20, and for live: 0.
  • hanke: (client) The Convenience#ids method now by default returns all ids returned from the server.
  • hanke: (client) The Convenience#populate_with’s second param is not the amount of populated ids anymore. Instead it populates all returned ids by default. If you want fewer, pass in the up_to option. So, e.g. results.populate_with :up_to => 20.

Version 2.3.0

  • hanke: (server) Integration specs in the server are now easy. In your specs, require 'picky-client/spec'. Example: it {'alan').ids.should == [259, 307, 449] }.
  • hanke: (generators) Added integration specs that use the above tests & matchers to the generated example app.
  • hanke: (client) Added Picky::TestClient which can be used in the server for integration specs. Use, :path => '/your_search_url'), then'bla', :ids => 12, :offset => 0).ids.should == [1,3,4] or'blu bli').should have_categories(['title', 'author'], ['title', 'title']) to test category result combinations and order.

Version 2.2.1

  • hanke: (server) Very simple geo search that works best in temperate areas. If you’re just looking for results that are close to yours, give it a go. Use #geo_categories(lat, lng, radius_in_kilometers, options = {})

Version 2.2.0

  • hanke: (server) (BREAKING CHANGE) Since I prefer the block style configuration for indexes, the source is now an optional parameter. Picky will tell you if you still use the old style. New is that you can define the source of an index in the block, e.g.: do source end
  • hanke: (server) Sources can now be anything that responds to #each and that returns objects that respond to #id. (That means you can just pass in an array, or MongoMapper or ActiveRecord’s Book.order('updated_at DESC') or similar)
  • hanke: (server) The app/application.rb API has gotten a few aliases: default_indexing and default_querying can now be called with indexing or searching.
  • hanke: (server) Each index can now have its own indexing. Use e.g. do indexing removes_characters: /[^a-z]/i end.
  • hanke: (server) Each Search can now have its own “searching”, e.g.: do searching removes_characters: /[^a-z]/i end
  • hanke: (server) Added option for collaborators (on the Picky server) of setting the performance ratio if the performance specs fail too often. Just add a spec/performance_ratio.rb file with the content module Picky; PerformanceRatio = x.xx end. Less than 1.0 is more benign, more than 1.0 is harsher.
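The source requirement from the 2.2.0 notes above (anything that responds to #each and yields objects that respond to #id) can be illustrated in plain Ruby. This is a minimal sketch that does not call Picky itself; the Book struct and the usable_as_source? helper are hypothetical names chosen for illustration.

```ruby
# A plain Struct is enough to provide objects that respond to #id.
Book = Struct.new(:id, :title, :author)

# An Array responds to #each, so it can act as the container.
books = [
  Book.new(1, 'The Hobbit', 'Tolkien'),
  Book.new(2, 'Dune', 'Herbert')
]

# Hypothetical helper (not part of Picky): checks the two duck-typing
# rules described in the changelog entry.
def usable_as_source?(source)
  return false unless source.respond_to?(:each)
  source.each { |item| return false unless item.respond_to?(:id) }
  true
end

puts usable_as_source?(books)
```

An ActiveRecord relation such as Book.order('updated_at DESC') would satisfy the same two rules, which is why the entry mentions it alongside arrays.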

Version 2.1.2

  • hanke: (server) Improved rake search <url> [<result id amount>] with better description and error handling.

Version 2.1.1

  • hanke: (server) rake search <url>, a simple experimental terminal search interface.

Version 2.1.0

  • hanke: (server) Tokenizing completely rewritten. It works now almost the same in indexing and in querying, with the exception of downcasing (or not, for case sensitive searches).
  • hanke: (server) Indexing and querying now don’t downcase anymore right at the beginning of processing, but rather after text preprocessing. For you this means that you need to use case insensitive regexps /…/i in the config if you need a case sensitive search (get it?).
  • hanke: (server) default_indexing and default_querying offer a new option, case_sensitive, which is by default false. Set it in indexing and querying to true to have your search be case sensitive (usually it is a good idea to set them both to the same case sensitivity). Watch the regexp options – possibly best if you set them to case insensitive /…/i.

Version 2.0.0

  • hanke: Let’s go live, wohoo! :) See the prerelease history notes for all changes.

Version 2.0.0.pre3

  • hanke: (server) Renamed Similarity::DoubleLevenshtone (aka Similarity::Phonetic) to Similarity::DoubleMetaphone (BREAKING: Cannot use Similarity::Phonetic anymore).
  • hanke: (server) Added Similarity::Soundex.
  • hanke: (server) Added Similarity::Metaphone.

Version 2.0.0.pre2

  • hanke: (client) Asterisks are correctly escaped before saved in the browser history.
  • you: Give feedback, thanks! :)

Version 2.0.0.pre1

  • hanke: New major version number – see reasons for API change:
  • hanke: (server) (Breaking change) Query::Full and Query::Live have been replaced by just Search. So what you now do is route /something/ =>, index2, ..., options).
  • hanke: (server) Pass in the ids param to define the amount of result ids you’d like. This is how you’d do it with curl: curl 'localhost:8080/books?query=test&ids=20'. 20 ids is the default.
  • hanke: (client) (Breaking change) Picky::Client::Full and Picky::Client::Live have been replaced by Picky::Client. New option: ids. Pass in to define the amount of ids you’d like. For a live query you need none, so pass in 0. (20 is the default in the server)
  • hanke: (generated clients) See client changes above. Replace Picky::Client::Full and Picky::Client::Live with just a single Picky::Client instance with the same options as before (but just a single URL on the server as described above).
  • hanke: (generated servers) See server changes above. Replace Query::Full and Query::Live instance pairs by just a single Search instance.
  • hanke: (client) Added rake javascripts, rake update to the client and client project generator which copies the javascripts from the client gem into your directory. (If you have an old generated project, add require 'picky-client/tasks'; in your Rakefile)
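The curl call with the ids param in the notes above can also be reproduced by building the URL in Ruby. This is only a sketch using the standard library; the host, port, and path are taken from the curl example, and nothing here goes through the Picky client gem.

```ruby
require 'uri'

# Build the same URL as the curl example: query text plus the
# number of result ids to return.
params = { query: 'test', ids: 20 }

url = URI::HTTP.build(
  host:  'localhost',
  port:  8080,
  path:  '/books',
  query: URI.encode_www_form(params)
)

puts url
```

For a live query, which needs no ids according to the client entry, the same sketch would pass ids: 0.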

Version 1.5.4

  • hanke: (client) Not breaking the web anymore ;) Using history.js instead of address.js to do away with the hash bang.

Version 1.5.3

  • hanke: (server) rake stats and rake analyze. Get information about your app.

Version 1.5.2

  • hanke: (server) When indexing from the database, the intermediary snapshot table is now called "picky_#{index.identifier}_index" instead of "#{index.identifier}_type_index" to be clearer that it is Picky creating these tables, and what it is. You can remove the …_type_index tables.
  • hanke: (server) The database source now uses mostly AR adapter methods to make it more agnostic.

Version 1.5.1

  • hanke: (server) Picky now traverses more cleanly over your database data. (Thanks Jason Botwick!)

Version 1.5.0

  • hanke: (server) Redis backend.
  • hanke: (server) The Redis backend uses db 15.
  • hanke: (server) The mysql gem is used by default.

Version 1.4.3

  • hanke: (server) Fix for non-working picky command line interface. (Thanks Jason Botwick!)

Version 1.4.2 (Redis backend prerelease)

  • hanke: (server) Redis backend prototype.
  • hanke: (server) rake index:specific[index] or rake index:specific[index,category] to index just a specific index or category.
  • hanke: (server) Postgres source better handled.

Version 1.4.1

  • hanke: (client/generators) The choices option is now localized. If you have generated a new Picky project with 1.4.0, please do localize your choices like so: choices:{ (formats here) } => choices:{en:{ (formats here) }} and whatever locales you’d like to use.

Version 1.4.0

  • hanke: (client/generators) Latest Javascript PickyClient object includes the option to format the choices better, option group: [['author', 'title', 'subjects'], ['publisher']] lets you group certain categories together while option choices: { 'title': format: "<strong>%1$s</strong>", filter: function(text) { return text.toUpperCase(); }, ignoreSingle: false } lets you define how each combination is handled in detail. Again, hard to explain, easy to see. (see issue for details, closes issue 9)
  • hanke: (client/generators) Added a wrapResults option where you can define wrapper HTML bits that are wrapped around each allocation group of <li> results. The default is: wrapResults: '<ol class="results"></ol>'.
  • hanke: (client/generators) Headers are now contracted, this means no more “written by florian and written by hanke”, but “written by florian hanke”. (closes issue 10)
  • hanke: (client) Split #interface method into => #input, #results, so that users can wrap each with custom elements. Don’t forget to wrap into a div#picky.
  • hanke: (generators, breaking change!) Example now constricts the Picky interface width using a div.content. Please use a wrapper div to constrict div#picky.
  • hanke: (generators) Cleanup of Javascript code, inclusion of formerly external javascripts (scrollTo, timer, jQuery 1.5).
  • hanke: (generators, possible breaking change!) Interface HTML structure refactor. Results should now be li-s. Result groups (combinations/allocations, around the result li-s) are each inside an ol.results. Please check your CSS files if they need to be adapted to the new structure.
  • hanke: (generators) Cleanup of CSS, much more flexible and specific.

Version 1.3.4

  • hanke: (generators/client) In the generated Sinatra client, queries can be passed in through the query param q. Example:
  • hanke: (generators/client) In the generated Sinatra client, the back/forward buttons work via the jquery.address plugin. Closes GitHub issue 6.

Version 1.3.3

  • hanke: (server/client) Server now sends the similar word instead of the original in similarity tokens (semelor~ → similar), even if that means that the original way of writing is not preserved (SEmElOr~ → similar). We’re trying to help people have good searches, so there.

Version 1.3.2

  • hanke: (all) Fixed description in the “picky” command. Also now shows optional parameters more clearly.

Version 1.3.1

  • hanke: (server) Ability to handle string/symbol keys (for future key/value store data sources).
  • hanke: (server) Live interface uses select instead of sleep in the master process.

Version 1.3.0

  • hanke: (server) Offers a new routing API, an interface that permits changing parameters in the running server. Use route %r{/admin} =>
  • hanke: (statistics) The statistics server is now called “Clam”, a chain smoking friend of Picky’s.
  • hanke: (live) A new Gem “picky-live” that offers a live interface into the Picky server, provided you have a route for it. It is called “Suckerfish”, and is one of Picky’s friends, too.

Version 1.2.4

  • hanke: (server) default_indexing (in the application.rb) provides a new option reject_token_if => some_lambda, e.g.: reject_token_if: lambda { |token| token.nil? || token == :hello } where you can define which tokens go into the index, and which do not. Default lambda is: &:empty?. This means that only non-empty tokens are saved in the index. You could, for example, not save tokens that have length < 2 (since they might be too small for your purposes). Note that tokens are passed into the hash as symbols.
  • hanke: (statistics) Fixed a bug where the last line in the log file was counted a second time after reloading the stats.
  • hanke: (statistics) Slight interface redesign.
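The reject_token_if option from the 1.2.4 notes above takes a lambda that decides which tokens stay out of the index. The following sketch applies such lambdas to a hand-made token list in plain Ruby, without involving Picky; the token list and variable names are made up for illustration.

```ruby
# Tokens are passed in as symbols, as the entry notes.
tokens = [:a, :of, :picky, :"", :search]

# The default described in the entry: reject empty tokens.
default_rejection = :empty?.to_proc

# A stricter variant: also reject tokens shorter than two characters.
short_rejection = lambda { |token| token.nil? || token.length < 2 }

kept_by_default = tokens.reject(&default_rejection)
kept_by_short   = tokens.reject(&short_rejection)

p kept_by_default
p kept_by_short
```

In an application.rb the stricter lambda would be passed as reject_token_if: short_rejection, following the form shown in the entry.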

Version 1.2.3

  • hanke: (server) Fixed a bug where the partial strategy Partial::None was not correctly used: A query like Peter did not return results even if “Peter” could be found using quotes: “Peter” (FYI, double quotes force Picky to use the exact index instead of the partial one. While, conversely, the asterisk* forces Picky to use the partial index instead of the exact one).

Version 1.2.2

  • hanke: (statistics) Statistics server handles logfile reading in a cleaner way when the gem has been installed by root.

Version 1.2.1

  • hanke: (statistics) (BETA) New statistics gem for Picky. Run picky stats path/to/your/search.log [port] to start a statistics server. Go to http://localhost:4567 after running the command to take a look.

Version 1.2.0

  • hanke: (client) (BREAKING) => ‘bla’) has changed to‘bla’), as the query itself is not optional. The rest of the options are still passed in as a Hash through the second parameter.

Version 1.1.7 (1.2.0 pre)

  • hanke: (server) Redefined API for 1.1.6 beta feature, ranged search.
  • hanke: (documentation) API for #define_ranged_category.

Version 1.1.6

  • hanke: (server) Enabled beta feature “low/high limited range search”, see API RDoc (IndexAPI class).

Version 1.1.5

  • hanke: (server) Passing in a similarity search (e.g. with text “hello”) will never return “hello” as a similar token.

Version 1.1.4

  • hanke: (generators) Removed unnecessary jquery-1.3.2 from client, since it wasn’t referenced anyway.

Version 1.1.3

  • hanke: (server) The CouchDB source now uses a little trick/hack to make its ids work in Picky. They are translated into decimal numbers from its hex string representations. Recalculate using #to_s(16) before getting objects from CouchDB in a webapp.
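The hex-to-decimal id translation described above can be sketched in plain Ruby. The id below is a made-up 32-character hex string, and the padding step is an assumption on my part: Integer#to_s(16) drops leading zeros, so an id that starts with 0 would otherwise come back shorter than the original.

```ruby
# Hypothetical CouchDB-style id (32 hex characters, leading zero on purpose).
couch_id = '0a3f9c2e5b7d4e1f8a6b4c2d1e0f9a8b'

# Hex string -> decimal integer, the direction the entry describes for indexing.
decimal_id = couch_id.to_i(16)

# Decimal integer -> hex string, as suggested via #to_s(16).
# Left-pad with zeros to restore the original length.
restored_id = decimal_id.to_s(16).rjust(couch_id.length, '0')

puts decimal_id
puts restored_id
```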

Version 1.1.2

  • hanke: (generators) Added generator for empty unicorn projects, use picky generate empty_unicorn_project <project_name> to generate one.

Version 1.1.1

  • hanke: (server and client) Removed generator projects that have been moved to picky-generators. Gems now much smaller :)

Version 1.1.0

  • hanke: (server and client) Generators extracted into picky-generators gem.
  • hanke: (generators) Generators and example projects for both server and client.

Version 1.0.0

  • hanke: Lots of API RDoc.
  • hanke: Yaaaay! Finally :)

Version 0.12.3 (1.0.0 pre4)

  • hanke: (server) Fixed cased file name (led to problems under Linux, thanks Bernd Schoeller)

Version 0.12.2 (1.0.0 pre3)

  • hanke: (server) New :from option. Assume you have a source, file:'some_file.csv') but you want the category to be called differently. Use the from option as follows: define_category(:similar_title, :from => :title).
  • hanke: (server) CSV source uses FasterCSV, passing through all its options (col_sep, row_sep et cetera).
  • hanke: (server) More understandable output for rake try, rake try:index, rake try:query.

Version 0.12.1 (1.0.0 pre2)

  • hanke: (server) Fixed a bug where the default qualifier definition (like the author in the query author:tolkien) for categories was ignored.

Version 0.12.0 (1.0.0 pre1)

  • hanke: (server) API change in application.rb: Use #define_category instead of #category on an index. (category still possible, but deprecated)
  • hanke: (server) Internal rewrite.

Version 0.11.2

  • hanke: (server) Rake task index:check will check if all necessary index files are generated. (Nice to use before restarting.)

Version 0.11.1

  • hanke: (server) Better error reporting in Rake tasks. Task naming improved.
  • hanke: (server) Internal cleanup.

Version 0.11.0

  • hanke: (server) Major API and internals rewrite. See generated project for help.

Version 0.10.5

  • hanke: (server) Source CouchDB added (thanks to

Version 0.10.4

  • hanke: (server) Typo fixed (thanks to

Version 0.10.3

  • hanke: (client) Helpful configuration page in the client at /configure.

Version 0.10.2

  • hanke: (server) Phonetic similarity (e.g. lyterature~) available through, see example.

Version 0.10.1

  • hanke: (server) :weights option for queries also ok in the form: { [:cat1, :cat2] => 4 }, where 4 is any weight.
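The hash form of the :weights option above relies on Ruby arrays being usable as hash keys, since arrays hash by content. The sketch below shows only that plain Ruby lookup; the weight_for helper and the second entry are hypothetical, and how Picky itself applies the weights is not shown here.

```ruby
# Category combinations as keys, weights as values.
weights = {
  [:cat1, :cat2] => 4,
  [:title]       => 2
}

# Hypothetical helper: returns the weight for a combination, or 0 if none is set.
def weight_for(weights, combination)
  weights.fetch(combination, 0)
end

puts weight_for(weights, [:cat1, :cat2])
puts weight_for(weights, [:cat2, :cat1])
```

In this plain lookup the order of categories matters, because [:cat1, :cat2] and [:cat2, :cat1] are different arrays.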

Version 0.10.0

  • hanke: (server) (BREAKING) Total rewrite/exploration of the Application API.
    Stay on 0.9.4 if you don’t want to update right now.
  • hanke: (server) Character substitution now configurable. Default is no character substitution.

Version 0.9.4

  • hanke: (server) rake routes: Shows all current URL paths, and if they are processable fast.

Version 0.9.3

  • hanke: (server) Fixed: Querying parameters are not ignored anymore.

Version 0.9.2

  • hanke: (client) Fixed result_hash.entries to return the right amount of entries.
  • hanke: (client) The result_hash#entries now takes a block and replaces the e.g. AR instances with e.g. rendered results.
  • hanke: (client) Locale handling fixed. Uses the locale of the HTML tag by default.

Version 0.9.1

  • hanke: (server) Delicious missing gem notice if www-delicious gem is missing.
  • hanke: (server) Partial::Subtoken renamed to Partial::Substring.
    Options: down_to → from, starting_at → to
  • hanke: (server) Index bundle file handling extracted into specific Index::Files backend.

Version 0.9.0

  • hanke: (server/client) Jump to 0.9.0 to work on API, release 1.0.0 soon.
  • hanke: (server) Partial indexing now only down to -3, e.g. florian → partial: floria, flori, flor.
    If you want down_to the first character (florian, floria, flori, flor, flo, fl, f), use:
    field(:some_field_name, :partial => => 1))
  • hanke: (server), pass) for indexing your delicious posts.
  • hanke: (server) indexing and querying config now done on tokenizer instances.
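The partial indexing entry in the 0.9.0 notes above (florian → floria, flori, flor when going down to -3, or all the way to f when going down to 1) can be reproduced with a small helper. This is a sketch that mimics the behaviour as described in the entry, not Picky's own implementation; partial_tokens and its down_to parameter are hypothetical names.

```ruby
# Hypothetical helper: returns the token and its shortened prefixes.
# A negative down_to counts from the end of the token (-3 removes up to
# three characters); a positive down_to is the shortest prefix length.
def partial_tokens(token, down_to)
  shortest = down_to < 0 ? token.length + down_to : down_to
  # Keep the shortest length within 1..token.length.
  shortest = [[shortest, 1].max, token.length].min
  token.length.downto(shortest).map { |length| token[0, length] }
end

p partial_tokens('florian', -3)
p partial_tokens('florian', 1)
```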

Version 0.3.1

  • hanke: (server) Generator gives more informative NoGeneratorError message.

Version 0.3.0

  • hanke: (server) Uses json (index, index weights) and marshal (similarity index) to dump indexes.
  • hanke: (server) Generator is more helpful (thanks to
  • hanke: (server) Generator for a Sinatra project. (picky-client sinatra project_name <- Note: Changed to picky generate sinatra_client project_name)
  • hanke: (client) Helpful generator. (thanks to

Version 0.2.4

  • hanke: (server) Indexing output, output in general cleaned up.
  • hanke: (server) Better info after generating a new project (thanks kschiess).
  • hanke: (server) Indexer now uses json for the dump files (much faster, slightly larger, thanks to
  • hanke: (client) JS files rewritten.

Version 0.2.3

  • hanke: (server) Explicit index buffering: Indexer hits the filesystem only seldom.
  • hanke: (server) Internal rename from full index to exact index (visible in index filenames).
  • hanke: (server) Solr Indexing removed until someone needs it. Then we’ll talk cash. Just kidding.
  • hanke: (server) Improved Gemfile.

Version 0.2.2

  • hanke: (server) Umlaut handling (i.e. character substitution) now pluggable.
  • hanke: (server) Apps finalization now handled through Ruby callback (thanks to

Version 0.2.1

  • hanke: (server) Fix for negative partial index values (:partial => => -3))

Version 0.2.0

  • hanke: (server) Only uses JSON to encode results.
  • hanke: (client) Only uses JSON for full and partial queries.

Version 0.1.0

  • hanke: (server) Application interface rewrite. See a freshly created
    project (using picky project <- Note: Renamed to picky generate unicorn_server).

Version 0.0.9

  • hanke: (client) Cleanup. Frontend example.

Version 0.0.8

  • hanke: (server) Application#add_index instead of Application#type.
  • hanke: (server) Simplified scaffolding.

Version 0.0.7

  • hanke: (server) Gem compiles on install. Do not compile on run.

Version 0.0.6

  • hanke: (server) Removed unnecessary gem dependencies (thanks to niko).
  • hanke: (server) Added CSV to the possible Sources., :author, :isbn, :file => ‘data/books.csv’),
  • hanke: (server) Renamed all instances of SEARCH_* constants to PICKY_*. (Uses RACK_ENV)

Version 0.0.5

  • hanke: (server), now top level in newly created project (more standard).
  • hanke: (server) Port now defined in (use listen ‘host:port’).
  • hanke: (client) Enriched callbacks in the JS interface definition (before, success, after).

Version 0.0.4

  • hanke: (client) Interface now created using Picky::Helper.interface or .cached_interface (if you only have a single language in your app).

Version 0.0.3

  • hanke: (server) C-Code cleaned up, removed warnings.

Version 0.0.2

  • hanke: (server) Newly created application better documented.

Version 0.0.1

  • hanke: (server/client) Initial project. Server (picky) and basic frontend client (picky-client) available.