dyoder / functor

Implements pattern-based method dispatch for Ruby, inspired by Topher Cyll's multi.

This URL has Read+Write access

name age message
file .gitignore Loading commit data...
file README
file Rakefile
directory doc/
file functor.gemspec
directory lib/
directory metrics/
directory test/
README
Functor provides pattern-based function and method dispatch for Ruby, originally inspired by Topher Cyll's multi gem. 

= Method Functors

To use it in a class:

  class Repeater
    attr_accessor :times
    include Functor::Method
    functor( :repeat, Integer ) { |x| x * @times }
    functor( :repeat, String ) { |s| [].fill( s, 0, @times ).join(' ') }
  end

  r = Repeater.new
  r.times = 5
  r.repeat( 5 )   # => 25
  r.repeat( "-" ) # => "- - - - -"
  r.repeat( 7.3 ) # => Functor::NoMatch!

= Stand-Alone Functors

You can also define Functor objects directly:

  fib ||= Functor.new do |f|
    f.given( Integer ) { | n | f.call( n - 1 ) + f.call( n - 2 ) }
    f.given( 0 ) { |x| 0 }
    f.given( 1 ) { |x| 1 }
  end
  
You can use functors directly with functions taking a block like this:

  [ *0..10 ].map( &fib )  # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
  
You can call a functor as a method using #call:

  fun.call( obj, 7 )
  
= Pattern Matching

Arguments are matched first using #===, so anything that supports these methods can be matched against. In addition, you 
may pass as a "guard" any object that responds to #call and which takes an object (the argument) and return true or 
false. This allows you to do things like this:

  stripe ||= Functor.new do
    given( lambda { |x| x % 2 == 0 } ) { |x| 'white' }
    given( lambda { |x| x % 2 == 1 } ) { |x| 'silver' }
  end
  
  stripe.call( 3 ) # => 'silver'
  stripe.call( 4 ) # => 'white'

= Precedence

Precedence works similarly to Ruby method definition, i.e. Last In, First Out. Thus, you need to be careful in how you 
define your functor. The Fibonacci example above would not work properly if the Integer pattern was given last.

= Caching Options

Methods defined with functors are substantially slower than methods defined natively with "def".  To (partially) 
alleviate the performance hit, Functor keeps track of which functor block matches each particular *args set, allowing it 
to short circuit the matching process when an *args set is seen again.  If the *args cache were unlimited, this would 
represent a serious memory leak.  Naturally, we have not implemented an unlimited cache.

Rather, the *args cache is subjected to flexible and scalable limits that also have the effect of promoting frequently 
encountered *args sets and allowing those less frequently encountered to languish, yea even unto death.  The caching 
system uses two configuration parameters, which may be set for Functor or for any class which has included 
Functor::Method.  When a class does not set these parameters, it will inherit the options set at the Functor level.  
Functor comes with (what we think are) sensible defaults.

The :size parameter controls, albeit indirectly, the number of items the caching system may store.  If your :size is 
greater than the number of *args sets the functor will encounter, you do not need to worry (or even know) about the 
other parameter.  You can set the :size param thusly:

  Functor.cache_config :size => 10_000
  
  class A
    include Functor::Method
    functor_cache_config :size => 700
  end

The other parameter, :base, determines the thresholds for promotion, i.e. how many times an *args set must be seen 
before it "becomes more important". While the use of the :size parameter is somewhat straightforward (albeit indirect), 
you need some idea of how the caching system works to understand the :base param.  Functor classes maintain cached *args 
sets in tiers.  When attempting to short-circuit the matching process, Functor checks each of these caches, starting at 
the top and working downward.  Each tier has a promotion threshold, which represents the number of hits an item must 
receive before it can jump to the next tier.  The promotion thresholds are determined using exponents of the :base 
parameter.  The threshold for exiting the lowest tier is (base ** 1); for the next tier, it is (base ** 2); for the 
next, (base ** 3).  

Assume the following configuration:  

  Functor.cache_config :base => 10

Functor currently uses 4 tiers. Let us call them c0, c1, c2, and c3.  To pass from c0 to c1, an item must receive 10 
(i.e. 10**1) hits. To pass from c1 to c2, it must receive 100 (10**2) hits.  For promotion to c3, it must receive 1,000 
hits.  A lower :base setting results in lower thresholds across all tiers.  The thresholds for a base of 8 would be 8, 
64, and 512.  Again, if your cache :size is large enough, the promotion thresholds are irrelevant.  If, however, the 
number of distinct *args sets you expect to encounter is larger than your desired cache size, the :base parameter allows 
you to tune the promotion thresholds for a better fit with the frequency distribution of the *args sets.  A high :base 
setting means that only very high-frequency items will make it into the top cache tier.

Tier size limits are enforced in a useful way. Each tier has a size limit. When that size is reached, all of its items 
are dropped into the next lower tier, which drops all of its items into the next lower tier, etc.  Tier size limits 
increase from top to bottom, so that when c3 dumps its items into c2, ample space remains in c2.  After a tier cascade, 
all of the items in a particular tier have a hit count higher than the promotion threshold, guaranteeing that the next 
hit will promote the item.  The effect of this is to allow "languishing" items at a certain level to eventually drop out 
the bottom, while preserving the valuable, active, high-frequency items.

= Credits And Support

Functor was written by Dan Yoder, Matthew King, and Lawrence Pit. Send email to dan at zeraweb.com for support or 
questions.