Simplifies querying of structured data using relational algebra
The purpose of this project is to expand my knowledge of relational algebra by attempting to implement a simple query system using the primitive operations defined in relational algebra.
Most of the design is heavily inspired from koios and arel. The reason I decided to write my own and not just build on top of those systems was not so much because I don’t like the code/API in those projects, it’s more because I wanted to gain a depth of understanding that can only be earned by trying to solve the problem myself.
The first phase of this project will be to implement all the operations listed below using in-memory data structures. I’m focusing on the API, and making sure the specs ensure the desired results are obtained from each operation.
The second phase of this project will be to add an RDBMS based engine, and move the in-memory matching to it’s own engine. I’ll also be working on a system where if the primary engine cannot carry out some operation, that it first look at alternate forms (eg. using a join instead of an intersection), and then fall-back to in-memory matching. I also want to look at re-arranging queries so that all the operations that can be performed natively are “pushed down” the hierarchy and then the in-memory matching is performed last.
The third phase of this project will be to add a few NoSQL engines (like MongoDB and CouchDB) and then look at writing a DataMapper adapter that translates Query objects into Veritas relations. I want to make sure all the DM specs pass with this adapter and each engine, and if everything goes well I will look at updating DM to work directly on top of Veritas.
relation = Veritas::Relation.new([ [ :id, Integer ] ], [ [ 1 ] ]) # Relational Operators # projection new_relation = relation.project([ :a, :b, :c ]) # new_relation has only attributes a, b and c # removal new_relation = relation.remove([ :a, :b, :c ]) # new_relation has attributes except a, b and c # rename new_relation = relation.rename(:a => :b, :c => :d) # theta-join new_relation = relation.join(other, relation[:a].gte(other[:b])) new_relation = relation.join(other) { |r| r[:a].gte(r[:b]) } # NOTE: theta-join is effectively restricting a product of the relations # natural join new_relation = relation.join(other) # OR relation + other # product new_relation = relation.product(other) # OR relation * other # intersection new_relation = relation.intersect(other) # OR relation & other # union new_relation = relation.union(other) # OR relation | other # difference new_relation = relation.difference(other) # OR relation - other # restriction new_relation = relation.restrict(relation.name.eq('other')) new_relation = relation.restrict { |r| r[:a].eq('other').and(r[:b].gte(42)) } # Non-Relational Operators # sort by specific attributes new_relation = relation.order { |r| [ r[:a].desc, r[:b] ] } # limiting (only allowed if ordered) new_relation = relation.limit(10) # offset (only allowed if ordered) new_relation = relation.offset(0) # NOTE: a relation is Enumerable relation = relation.each { |tuple| ... } # returns a Set that represents relation header header = relation.header # NOTE: each attribute is an object representing the name, type object, and possibly other constraints # can only modify relations that project all candidate keys for the base relations new_relation = relation.create(:a => 'test', :b => 'other', :c => 'yet another') new_relation = relation.update(:a => 'test', :b => 'other', :c => 'yet another') new_relation = relation.delete
-
Fork the project.
-
Make your feature addition or bug fix.
-
Add tests for it. This is important so I don’t break it in a future version unintentionally.
-
Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
-
Send me a pull request. Bonus points for topic branches.
Copyright © 2009 Dan Kubb. See LICENSE for details.