Bayesian averages for voting/rankings systems in Rails.

Bayesian Average

A (work in progress) gem for adding Bayesian averages to your Rails projects.

What is a Bayesian average?

tl;dr - Get rid of the issues caused by averages (means) of small datasets or datasets with outliers.

A Bayesian average is a method of estimating the mean of a population consistent with Bayesian interpretation, where instead of estimating the mean strictly from the available data set, other existing information related to that data set may also be incorporated into the calculation in order to minimize the impact of large deviations, or to assert a default value when the data set is small.

For example, in a calculation of an average review score of a book where only two reviews are available, both giving scores of 10, a normal average score would be 10. However, as only two reviews are available, 10 may not represent the true average had more reviews been available. The review site may instead calculate a Bayesian average of this score by adding the average review score of all books in the store to the calculation. For example, by adding five scores of 7 each, the Bayesian average becomes 7.86 instead of 10, which the review site would hope better represents the quality of the book.

Taken from the Wikipedia article.
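
The arithmetic in the book example can be checked with a few lines of plain Ruby. This is only a sketch of the usual Bayesian average formula, (weight * prior_mean + sum of scores) / (weight + number of scores); the method name and the prior_mean and weight parameters are made up for illustration and are not part of this gem's API.

```ruby
# Hypothetical helper, not part of the gem.
# prior_mean: the average score of the wider collection (all books in the store)
# weight:     how many "virtual" scores at prior_mean are mixed in
def bayesian_average(scores, prior_mean:, weight:)
  (weight * prior_mean + scores.sum).to_f / (weight + scores.size)
end

# Two reviews of 10, plus five virtual scores of 7 each.
book = bayesian_average([10, 10], prior_mean: 7, weight: 5)
puts book.round(2) # prints 7.86

# With no reviews at all, the result falls back to the prior mean.
unrated = bayesian_average([], prior_mean: 7, weight: 5)
```

The larger the weight, the more real scores it takes to pull the result away from the collection's mean.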

Dependencies

The project currently depends on Mongoid. I might add ActiveRecord support later.

Use

Gemfile:

gem "bayesian_average", "~> 0.1.1"

In your model being ranked:

class Movie
  include Mongoid::Document
  include Mongoid::BayesianParent
  
  has_many :rankings
  
  bayesian_parent_for :ranking, weight: 100
  
  def bayesian_collection
    Movie.all
  end
end

In your model representing the rankings:

class Ranking
  include Mongoid::Document
  include Mongoid::BayesianChild
  
  belongs_to :movie
  
  field :score, type: Integer, default: 0
  
  bayesian_child_for :movie, field: :score
end

Here we define the parent and child. Let's look at the parent first.

The parent must has_many of the child objects. The line bayesian_parent_for :ranking, weight: 100 signifies that objects of the Ranking class hold the scores, and that the average of the collection counts as much as weight rankings would (here, 100). At this point parents can only have Bayesian scores for one class.

The bayesian_collection method is important. It defines what the Bayesian score is based on. In this case, the average of all the movies is taken into account, but defining this method yourself lets you choose a more appropriate subset. For example, if the Movie class has a Director, then director.movies might give a more reliable mean, since the movies by a particular director tend to be of a certain quality. Keep in mind that if this dataset is small, it will defeat the purpose of a Bayesian average, so something like this might be better:

def bayesian_collection
  director.movies.count >= 5 ? director.movies : Movie.all
end

The child class only has one interesting line in it. bayesian_child_for :movie, field: :score simply denotes which field should be used in the average, and which class it is scoring.

This exposes the following method:

movie.bayesian_average #=> Float

Also, you will get a method to update your existing database:

Movie.all.each { |movie| movie.update_bayesian }

Keep in mind that this will put a large load on your database. You probably want to do this a small section at a time and asynchronously with Resque or something similar.
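
One way to work through the records a section at a time is to slice the collection into fixed-size batches. The following is a rough sketch in plain Ruby: update_bayesian_in_batches and FakeMovie are hypothetical names used only for illustration, and the only thing assumed of each record is that it responds to update_bayesian. In a real app each batch could be handed to a background job instead of being processed inline.

```ruby
# Hypothetical helper, not part of the gem. Works on any Enumerable.
def update_bayesian_in_batches(parents, batch_size: 100)
  updated = 0
  parents.each_slice(batch_size) do |batch|
    batch.each do |parent|
      parent.update_bayesian
      updated += 1
    end
    # Optional hook between batches, e.g. to pause or to enqueue a job.
    yield batch if block_given?
  end
  updated
end

# Stand-in for a model instance so the sketch runs without a database.
FakeMovie = Struct.new(:refreshed) do
  def update_bayesian
    self.refreshed = true
  end
end

movies = Array.new(250) { FakeMovie.new(false) }
batch_sizes = []
total = update_bayesian_in_batches(movies, batch_size: 100) do |batch|
  batch_sizes << batch.size
end
```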

How It Works

This gem will store two fields on your parent model, num_bayesian_children and num_bayesian_points. Instead of storing a float, it will keep these fields to prevent rounding errors from propagating over the lifetime of your application.

The child model gets a before_create that increments the parent model atomically to update the number of children and number of total points.
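
To illustrate why two integer counters are enough, here is a minimal sketch in plain Ruby. ParentCounters, add_child and the prior_mean parameter are hypothetical stand-ins; only the field names num_bayesian_children and num_bayesian_points come from the gem. The formula is the standard Bayesian average, which is an assumption about how the gem combines the counters with the collection mean.

```ruby
# Hypothetical stand-in for the two fields stored on the parent model.
ParentCounters = Struct.new(:num_bayesian_children, :num_bayesian_points) do
  # Mirrors what the child's callback does: bump both counters.
  def add_child(score)
    self.num_bayesian_children += 1
    self.num_bayesian_points += score
  end

  # The float is derived on demand, so no rounding error is ever stored.
  def bayesian_average(prior_mean:, weight:)
    (weight * prior_mean + num_bayesian_points).to_f /
      (weight + num_bayesian_children)
  end
end

movie = ParentCounters.new(0, 0)
[8, 9, 10].each { |score| movie.add_child(score) }

# Assumed inputs: a collection mean of 6 and the weight of 100 used above.
average = movie.bayesian_average(prior_mean: 6, weight: 100)
```

Because the counters are integers, they stay exact however many rankings are added, and the average can always be recomputed from them.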

License (MIT)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.