Skip to content
Tahir Ahmad Dar edited this page Nov 2, 2015 · 3 revisions

Introduction

Storing large graphs on disk is cumbersome process and difficult to manage. Modelling them as tables on a relational database requires a constant schema translation as nodes and edges are not the first class entities. Agama tries to provide a simple and intuitive way to store large graphs onto disks. Agama uses a key-value store like Tokyo Cabinet to serialize graphs internally.

Installing the Gem

As Agama currently uses Tokyo Cabinet to store the data, make sure http://fallabs.com/tokyocabinet/ and its http://fallabs.com/tokyocabinet/rubypkg/ are installed and are working on the target machine.

Ubuntu users can install Tokyo Cabinet through apt-get.

$ sudo apt-get install libtokyocabinet8 libtokyocabinet-ruby1.8

If above command does not work, please follow the steps below to install tykyocabinet.

1) open: http://fallabs.com/tokyocabinet/rubypkg/

2) download latest package of tykyocabinet for ruby.

3) Extract tar.gz folder.

4) From terminal reach to the extracted folder with cd command.

5) type from terminal:

ruby extconf.rb

make

su

make install

hence, Tokyocabinet is installed .

Once Tokyo Cabinet is installed, install the gem

$ gem install agama

Note: Agama is currently tested only on Linux.

A Simple Example

An example graph

In the above graph, there are three types of nodes viz., triangle, square and circle, and two types of edges line and dotted. Each node has an associated property called importance, the number in blue next to a node and each edge has an associated property weight, the number shown in red. There can be an arbitrary number of properties defined for a node or an edge and the properties can be different for different nodes.

Inserting Data

To insert a graph into Agama, we create a new graph and insert nodes and edges. Before inserting an edge both its adjacent nodes must be present in the graph.

require 'rubygems'
require 'agama'
require 'agama/adapters/tokyocabinet'

graph = Agama::Graph.new(:path => "path/to/the/datastore",
                         :db   => Agama::Adapters::TC.new)
graph.open

Nodes: A node in the graph is composed of two compulsory values type and name (strings) and zero or more other values (any ruby object). If the graph is homogenous and if the type of a node does not make much sense then it can be ignored and internally it will be set to '_' indicating undefined type.

So to insert a node:

node_1 = graph.set_node(:name => 'a',          #Compulsory property
                        :type => 'triangle',   #Compulsory property, defaults to '_'  
                        :importance => 1)      #User defined property

Note: A node is uniquely identified by its name + type. So it is valid to have two nodes with the same name but having different types.

Edges: There are two kinds of edges -- directed and undirected. So to add an edge we need to specify the starting node (from), the ending node (to), the type of the edge, which again defaults to '_' indicating undefined type and whether the edge is directed or not.

So let us add two other nodes before we add edges.

node_2 = graph.set_node(:name => 'b',
                        :type => 'square',
                        :importance => 6)
node_3 = graph.set_node(:name => 'k',
                        :type => 'circle',
                        :importance => 5)

To add an undirected edge:

edge_1 = graph.set_edge(:from => node_1,      #Compulsory properties (from and to) 
                        :to => node_3,        #Order of from and to does not matter for undirected edges
                        :type => 'line',      #Defaults to '_'
                        :directed => false,   #Compulsory property for the first edge of that type
                        :weight => 4)         #User defined property

We set the directed property to true for directed edges.

edge_2 = graph.set_edge(:from => node_2,
                        :to => node_1,
                        :type => 'dotted',
                        :directed => true,
                        :weight => 4)

Note: If we add an edge of a particular type as directed once and as undirected the next time around the system will raise an error and will not add the edge.

After adding all the nodes and edges of our graph in a similar fashion we close the graph.

graph.close

Querying Data

We first start by creating an Agama object to open our datastore.

require 'rubygems'
require 'agama'
require 'agama/adapters/tokyocabinet'

graph = Agama::Graph.new(:path => "path/to/the/datastore"
                         :db   => Agama::Adapters::TC.new)
graph.open

Let us fetch a couple of nodes -- i and e -- from our sample graph.

node_1 = graph.get_node(:name => 'i', 
                        :type => 'triangle')
puts node_1[:importance]  #Should print 8

node_2 = graph.get_node(:name => 'e', 
                        :type => 'circle')
puts node_2[:importance]  #Should print 4

Now to fetch the edge between these two nodes.

edge = graph.get_edge(:from => node_1, 
                      :to => node_2, 
                      :type => 'line')
puts edge[:weight]  #Should print 6

Traversing the graph: We can also fetch a list of all the neighbours of a node.

graph.neighbours(node_1).each do |edge|
  p edge  #Should print the entire structure of the edge
end

There are several ways to filter the neighbours.

Filter by edge type:

graph.neighbours(node_1).along('dotted').each do |edge|
  p edge
end

Filter by edge type + direction of the edge:

graph.neighbours(node_1).along('dotted').outgoing.each do |edge|
  p edge
end

Filter by edge type + direction of the edge + the node type of the neighbour:

graph.neighbours(node_1).along('dotted').outgoing.of_type('circle').each do |edge|
  p edge
end

Just make sure the graph is closed before the end of the program.

graph.close

Architecture

Node Structure

Node Structure

The nodes of the graph are stored on the disk in a hash store (currently Tokyo Cabinet). Every node in the graph can be thought of as a collection of properties. The type property (which defaults to "_" if not mentioned) and the name property together form the key of a node. The type and name properties are delimited and converted to a string, which is used as the key for placing the node in a hash store. The rest of the properties constitute the value.

Edge Structure

Edge Structure

The edges of the graph are stored on the disk in a B+ tree store (currently Tokyo Cabinet). Every edge in the graph can again be thought of as a collection of properties. The key of the from node, the key of the to node, the edge type property (which defaults to "_" if not mentioned) and the direction property together constitute the key of an edge in the order shown above. These properties are delimited and converted to a string, which is used as the key for placing the edge in the store. The rest of the properties constitute the value.

Every edge is added in two directions. For example, if there is an undirected edge between node a and node b then an edge from a to b with direction "N" (uNdirected) is added and an edge from b to a with direction "N" is added. Similarly, if there is a directed edge from node a to node b then an edge from a to b with direction "O" (Outgoing) is added and an edge from b to a with direction "I" (Incoming) is added.

To Do

  • More Adapters: Kyoto Cabinet, Berkeley Db, Tokyo Tyrant.
  • Value is currently serialized using Ruby Marshal Class. This could be changed to JSON to make the data storage independent of the language.