# besquared/quotient-cube

An implementation of the quotient cube algorithm
Ruby
Switch branches/tags
Nothing to show
Fetching latest commitâ€¦
Cannot retrieve the latest commit at this time.
 Failed to load latest commit information. events lib papers spec vendor .gitignore .gitmodules NOTES.md README Rakefile TODO init.rb

```Quotient Cube is an algorithm and strategy for storing and querying data cubes.

This is a pure ruby implementation of that algorithm.

Requires tokyocabinet, autobots-transform

It is currently powering a personal analytics project and I think it is stable
enough (under some assumptions) for test and personal usage. Can also be used
as a tool for learning how this particular algorithm works.

Right now there is no support for incremental updates so everything is done
in batch. Smaller tables say with <10 dimensions can be built relatively
quickly (a few thousand rows/second) so this is appropriate for daily batches
consisting of several million rows. Once the cube is pre-computed querying
is very fast. Point queries on a small tree can be done at the rate of 4000-5000
per second. While range queries can be performed at rate of about 1000 per second.

Example:

require 'tokyocabinet'
require 'quotient-cube'

# 1) Base Data Table

@table = Table.new(
:column_names => [
'store', 'product', 'season', 'sales'
], :data => [
['S1', 'P1', 's', 6],
['S1', 'P2', 's', 12],
['S2', 'P1', 'f', 9]
]
)

@measures = ['sales[sum]', 'sales[avg]']
@dimensions = ['store', 'product', 'season']

# 2) Build quotient cube data structure

cube = QuotientCube::Base.build(@table, @dimensions, @measures) do |table, pointers|
sum = 0
pointers.each do |pointer|
sum += table[pointer]['sales']
end

[sum, sum / pointers.length]
end

# 3) Write out Quotient Cube Tree (QC-Tree) to disk

database = BDB.new
database.open(data_path, BDB::OWRITER | BDB::OCREAT)

Tree::Builder.new(database, cube).build

database.close

# Querying the QC-Tree

database.open
tree = Tree::Base.new(database)

# Point Query

@tree.find('sales[avg]', :conditions => {'product' => 'P1', 'season' => 'f'})
#=> {'product' => 'P1', 'season' => 'f', 'sales[avg]' => 9.0}

# Range Query

@tree.find('sales[avg]', :conditions => {'product' => ['P1', 'P2', 'P3']})
#=> [{'product' => 'P1', 'sales[avg]' => 7.5},
{'product' => 'P2', 'sales[avg]' => 12.0}]

# Range Query

@tree.find(:all, :conditions => {'product' => :all})
#=> [
{'product' => 'P1', 'sales[avg]' => 7.5, 'sales[sum]' => 15.0},
{'product' => 'P2', 'sales[avg]' => 12.0, 'sales[sum]' => 12.0}
]```