Adding UCB1 functionality

commit 2bb909617a563e46b7d44d970a76be7d357b4893 1 parent c58b8f9
Edward Weng (wengzilla) authored and committed
1  lib/bandit.rb
@@ -9,6 +9,7 @@
require "bandit/players/round_robin"
require "bandit/players/epsilon_greedy"
require "bandit/players/softmax"
+require "bandit/players/ucb"
require "bandit/storage/base"
require "bandit/storage/memory"
4 lib/bandit/experiment.rb
@@ -50,6 +50,10 @@ def participant_count(alt, date_hour=nil)
@storage.participant_count(self, alt, date_hour)
end
+ def total_participant_count(date_hour=nil)
+ @storage.total_participant_count(self, date_hour)
+ end
+
def conversion_rate(alt)
pcount = participant_count(alt)
ccount = conversion_count(alt)
1  lib/bandit/players/base.rb
@@ -7,6 +7,7 @@ def self.get_player(name, config)
when :round_robin then RoundRobinPlayer.new(config)
when :epsilon_greedy then EpsilonGreedyPlayer.new(config)
when :softmax then SoftmaxPlayer.new(config)
+ when :ucb then UcbPlayer.new(config)
else raise UnknownPlayerEngineError, "#{name} not a known player type"
end
end
31 lib/bandit/players/ucb.rb
@@ -0,0 +1,31 @@
+module Bandit
+ class UcbPlayer < BasePlayer
+ include Memoizable
+
+ def choose_alternative(experiment)
+ best_alternative(experiment)
+ end
+
+ def best_alternative(experiment)
+ best = nil
+ best_rate = nil
+ experiment.alternatives.each { |alt|
+ rate = experiment.conversion_rate(alt) + confidence_interval(experiment, alt)
+ if best_rate.nil? or rate > best_rate
+ best = alt
+ best_rate = rate
+ end
+ }
+ best
+ end
+
+ def confidence_interval(experiment, alt, date_hour=nil)
+ # force alt_participant_count to start at 1 to avoid divide by 0 errors
+ # force total_participant_count to start at 1 to avoid taking log of 0 errors
+ total_participant_count = [experiment.total_participant_count(date_hour), 1].max
+ alt_participant_count = [experiment.participant_count(alt, date_hour), 1].max
+ # scale to 100 to match conversion_rate output
+ Math.sqrt(2 * Math.log(total_participant_count) / alt_participant_count) * 100
+ end
+ end
+end
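The scoring in `UcbPlayer` can be tried outside the gem. Below is a minimal standalone sketch, not the gem's code: the `stats` hash, `ucb_bonus`, `conversion_rate`, and `ucb_choice` are hypothetical names introduced for illustration, and returning a rate of 0.0 for an alternative with no participants is an assumption about how the real `conversion_rate` behaves. It reuses the commit's formula, sqrt(2 * ln(total) / alt_count) * 100, with both counts clamped to 1.

```ruby
# Standalone sketch of UCB1-style scoring; it does not use the Bandit gem.
# `stats` is a hypothetical hash: alternative => { participants:, conversions: }.

# Exploration bonus on the same 0-100 scale as the conversion rate.
# Counts are clamped to 1, as in the commit, to avoid log(0) and division by zero.
def ucb_bonus(total_participants, alt_participants)
  total = [total_participants, 1].max
  n = [alt_participants, 1].max
  Math.sqrt(2 * Math.log(total) / n) * 100
end

# Conversion rate as a percentage.
# Assumption: 0.0 when nobody has seen the alternative yet.
def conversion_rate(participants, conversions)
  return 0.0 if participants.zero?
  conversions.to_f / participants * 100
end

# Pick the alternative with the highest rate-plus-bonus score.
def ucb_choice(stats)
  total = stats.values.sum { |s| s[:participants] }
  best = stats.max_by do |_alt, s|
    conversion_rate(s[:participants], s[:conversions]) +
      ucb_bonus(total, s[:participants])
  end
  best.first
end

stats = {
  "red"  => { participants: 1000, conversions: 100 },
  "blue" => { participants: 10,   conversions: 0 }
}
puts ucb_choice(stats)
```

With these made-up numbers, the rarely shown "blue" alternative wins despite having no conversions, because its bonus is much larger than that of "red". As "blue" accumulates participants its bonus shrinks and the observed conversion rate dominates the choice.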
6 lib/bandit/storage/base.rb
@@ -68,6 +68,12 @@ def incr_conversions(experiment, alternative, count=1, date_hour=nil)
incr conv_key(experiment, alternative, date_hour || DateHour.now), count
end
+ def total_participant_count(experiment, date_hour=nil)
+ experiment.alternatives.inject(0) do |tpc, alternative|
+ tpc + participant_count(experiment, alternative, date_hour)
+ end
+ end
+
# if date_hour isn't specified, get total count
# if date_hour is specified, return count for DateHour
def participant_count(experiment, alternative, date_hour=nil)
3  players.rdoc
@@ -1,6 +1,9 @@
= Bandit Players
There are a number of different possible players that each seek to explore/exploit using different methods. Each can be configured in the *bandit.yml* file in the config directory under your *Rails.root*.
+== UCB
+The UCB player chooses the alternative with the highest sum of its conversion rate and a confidence bound term (the UCB1 rule). Alternatives with few participants get a larger bound, so the player explores them first and shifts toward exploiting the best-performing alternative as participant counts grow. See the following blog post for more details: http://www.chrisstucchio.com/blog/2012/bandit_algorithms_vs_ab.html
+
== Epsilon Greedy
The epsilon greedy player selects the best alternative with a probability of 1 - epsilon, and selects uniformly among the other alternatives with a probability of epsilon. You can set the value of epsilon in the config file like this: