Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
An implementation of the quotient cube algorithm
Ruby
Branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
events
lib
papers
spec
vendor
.gitignore
.gitmodules
NOTES.md
README
Rakefile
TODO
init.rb

README

Quotient Cube is an algorithm and strategy for storing and querying data cubes.

This is a pure ruby implementation of that algorithm.

Requires tokyocabinet, autobots-transform

It is currently powering a personal analytics project and I think it is stable
enough (under some assumptions) for test and personal usage. Can also be used
as a tool for learning how this particular algorithm works.

Right now there is no support for incremental updates so everything is done
in batch. Smaller tables say with <10 dimensions can be built relatively 
quickly (a few thousand rows/second) so this is appropriate for daily batches
consisting of several million rows. Once the cube is pre-computed querying
is very fast. Point queries on a small tree can be done at the rate of 4000-5000
per second. While range queries can be performed at rate of about 1000 per second.

Example:

require 'tokyocabinet'
require 'quotient-cube'

# 1) Base Data Table

@table = Table.new(
  :column_names => [
    'store', 'product', 'season', 'sales'
  ], :data => [
    ['S1', 'P1', 's', 6],
    ['S1', 'P2', 's', 12],
    ['S2', 'P1', 'f', 9]
  ]
)

@measures = ['sales[sum]', 'sales[avg]']
@dimensions = ['store', 'product', 'season']

# 2) Build quotient cube data structure

cube = QuotientCube::Base.build(@table, @dimensions, @measures) do |table, pointers|
  sum = 0
  pointers.each do |pointer|
    sum += table[pointer]['sales']
  end

  [sum, sum / pointers.length]
end

# 3) Write out Quotient Cube Tree (QC-Tree) to disk

database = BDB.new
database.open(data_path, BDB::OWRITER | BDB::OCREAT)

Tree::Builder.new(database, cube).build

database.close

# Querying the QC-Tree

database.open
tree = Tree::Base.new(database)

# Point Query

@tree.find('sales[avg]', :conditions => {'product' => 'P1', 'season' => 'f'})
#=> {'product' => 'P1', 'season' => 'f', 'sales[avg]' => 9.0}

# Range Query

@tree.find('sales[avg]', :conditions => {'product' => ['P1', 'P2', 'P3']})
#=> [{'product' => 'P1', 'sales[avg]' => 7.5}, 
     {'product' => 'P2', 'sales[avg]' => 12.0}]

# Range Query

@tree.find(:all, :conditions => {'product' => :all})
#=> [
  {'product' => 'P1', 'sales[avg]' => 7.5, 'sales[sum]' => 15.0}, 
  {'product' => 'P2', 'sales[avg]' => 12.0, 'sales[sum]' => 12.0}
]
Something went wrong with that request. Please try again.