Titan is designed to support the processing of graphs so large that they require storage and computational capacities beyond what a single machine can provide. This is Titan’s foundational benefit. This section discusses the specific benefits of Titan and of the persistence solutions it supports.
When choosing a database, the CAP theorem should be considered carefully (C=Consistency, A=Availability, P=Partition tolerance). Titan is distributed with three supported storage backends: Cassandra, HBase, and BerkeleyDB. Their tradeoffs with respect to the CAP theorem are represented in the diagram below. Note that BerkeleyDB is a non-distributed database and, as such, is typically used with Titan only for testing and exploration purposes.
“Despite your best efforts, your system will experience enough faults that it will have to make a choice between reducing yield (i.e., stop answering requests) and reducing harvest (i.e., giving answers based on incomplete data). This decision should be based on business requirements.” – Coda Hale
HBase gives preference to consistency at the expense of yield, i.e. the probability of completing a request. Cassandra gives preference to availability at the expense of harvest, i.e. the completeness of the answer to the query (data available/complete data).
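The two ratios above can be made concrete with a small calculation. The following is a minimal sketch of a toy failure scenario, not a model of how HBase or Cassandra actually behave: the request counts, the partition counts, and the helper names `yield_of` and `harvest_of` are all assumptions introduced for illustration. It assumes 100 queries against data split over 4 partitions, one of which is unreachable, with 40 of the queries needing data from the unreachable partition.

```python
from fractions import Fraction


def yield_of(completed_requests, total_requests):
    """Yield: fraction of requests that were completed."""
    return Fraction(completed_requests, total_requests)


def harvest_of(data_available, complete_data):
    """Harvest: data available divided by complete data."""
    return Fraction(data_available, complete_data)


# Hypothetical scenario values (assumptions, not measurements).
TOTAL_REQUESTS = 100      # queries issued during the fault
AFFECTED_REQUESTS = 40    # queries that need the unreachable partition
PARTITIONS = 4            # partitions holding the complete data
REACHABLE = 3             # partitions still reachable

unaffected = TOTAL_REQUESTS - AFFECTED_REQUESTS

# Consistency-first policy: refuse every affected query,
# answer the remaining ones from complete data.
cp_yield = yield_of(unaffected, TOTAL_REQUESTS)
cp_harvest = harvest_of(PARTITIONS, PARTITIONS)

# Availability-first policy: answer every query, but affected
# queries only see the reachable partitions.
ap_yield = yield_of(TOTAL_REQUESTS, TOTAL_REQUESTS)
ap_harvest = harvest_of(
    unaffected * PARTITIONS + AFFECTED_REQUESTS * REACHABLE,
    TOTAL_REQUESTS * PARTITIONS,
)

print(f"consistency-first:  yield={float(cp_yield):.2f} harvest={float(cp_harvest):.2f}")
print(f"availability-first: yield={float(ap_yield):.2f} harvest={float(ap_harvest):.2f}")
```

In this toy scenario the consistency-first policy completes fewer requests but every answer is complete, while the availability-first policy completes every request with a lower average harvest. Which of the two losses is acceptable is the business decision the quote above refers to.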
Through the Fulgora extension, Titan can also be used as an in-memory graph database for low-latency applications. Fulgora provides support for in-memory storage backends and currently supports the Hazelcast distributed in-memory data grid.
Last edited by Matthias Broecheler,