First updates to the MongoDB comparison
syrio committed Sep 17, 2011
1 parent d0860a3 commit f053baa
Showing 1 changed file with 53 additions and 7 deletions.
60 changes: 53 additions & 7 deletions pages/Riak/Concepts/Comparisons/Riak-Compared-to-MongoDB.textile
@@ -1,4 +1,4 @@
This is intended to be a brief, objective and technical comparison of Riak and MongoDB. (_This comparison is current as of the 1.6 release of MongoDB._)
This is intended to be a brief, objective and technical comparison of Riak and MongoDB. (_This comparison is current as of the 2.0 release of MongoDB._)


<div id="toc"></div>
@@ -25,23 +25,26 @@ Riak uses "consistent hashing" to replicate data. This functionality is deeply a
* [[Add Nodes to Riak|Basic Cluster Setup#Add-a-Second-Node-to-Your-Cluster]]
* [[Consistent Hashing|Riak Glossary#Consistent-Hashing]]

As of the 1.6 release, MongoDB offers several options for replication:
MongoDB offers several options for replication:

1. Master/Slave

[[http://www.mongodb.org/display/DOCS/Master+Slave]]

2. Replica sets

From the Mongo Docs: "Replica Sets are MongoDB's new method for replication. They are an elaboration on the existing master/slave replication, adding automatic failover and automatic recovery of member nodes."
From the Mongo Docs: "Replica Sets are MongoDB's new method for replication. They are an elaboration on the existing master/slave replication, adding automatic failover and automatic recovery of member nodes."

[[http://www.mongodb.org/display/DOCS/Replica+Sets]]

While Master/Slave replication is still supported, Replica sets add automatic failover, so it's expected that most users will migrate to this configuration. However, in certain use cases traditional M/S is more appropriate and will still be supported.
A replica set is a collection of MongoDB servers (nodes) that form a cluster. Every set must have a primary node that processes all writes and, by default, all reads performed against that replica set. Reads can be served by one of the set's secondary nodes, but only if the client issuing the reads has agreed to this.
Nodes in a replica set can have priorities (specific priorities since v2.0) that influence which node is voted in as primary if the current primary goes down; all nodes in the set participate in the vote. It can take 10-30 seconds for the old primary to be considered down and for the others to elect a new primary.
Tagging is a new feature in v2.0 that lets the client control where data is written. Any tagging scheme can be used: a piece of data can be tagged with the actual IP address of the node the client wants to use, or with a general role string ("a fast server close to the app server") that the admin later maps to, say, a dedicated server in NYC.
Tagging also introduced the special Majority tag. Tagging a write with the majority tag enables basic quorum-like support for writes: the client can ask that a write command not return until the written data has propagated successfully to a majority of the nodes in a given replica set.
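
The majority rule described above can be illustrated with a minimal Python sketch. This is not MongoDB driver code; the function names @majority@ and @write_acknowledged@ are hypothetical and only model the arithmetic of "more than half of the nodes in the set".

```python
def majority(set_size: int) -> int:
    """Smallest node count that is strictly more than half of the set."""
    if set_size < 1:
        raise ValueError("a replica set needs at least one node")
    return set_size // 2 + 1


def write_acknowledged(acks: int, set_size: int) -> bool:
    """True once enough nodes have confirmed the write to form a majority."""
    return acks >= majority(set_size)


if __name__ == "__main__":
    for size in (3, 4, 5):
        print(f"{size} nodes -> majority of {majority(size)}")
```

Under this model a three-node set needs two confirmations, so a write that has only reached the primary would not yet be reported as successful.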

To enable horizontal scaling, Mongo uses a process known as "sharding," which involves designating certain servers to hold certain chunks of the data as the data set grows.
While Master/Slave replication is still supported, Replica sets add automatic failover, so it's expected that most users will migrate to this configuration. However, in certain use cases traditional M/S is more appropriate and will still be supported.

New in the 1.6 release is "auto-sharding."
To enable horizontal scaling, Mongo uses a process known as "sharding," which involves designating certain servers to hold certain chunks of the data as the data set grows.

[[http://www.mongodb.org/display/DOCS/Sharding]]
[[http://www.mongodb.org/display/DOCS/Sharding+Introduction]]
@@ -55,6 +58,17 @@ MongoDB has support for removing shards from your database.

[[http://www.mongodb.org/display/DOCS/Configuring+Sharding#ConfiguringSharding-Removingashard]]


h2. Backups

In Riak, backups (hot and cold) can be performed per-node, or whole-cluster.
When using Bitcask (Riak's default storage engine), you can perform a per-node backup simply by doing a filesystem backup of the Bitcask storage data directory. Restoring the node content is done by simply replacing the content of the data directory with the filesystem backup.
If you are using a different storage engine (such as Innostore) or would like to perform a whole-cluster backup, then this is done using the riak-admin tool.

MongoDB offers several ways to perform backups, including hot backups.
If journaling is enabled (v1.7.5+), it is possible to take a snapshot of the entire DB directory while the DB is running. Other options are to use the mongodump tool, to use a dedicated replicated slave, or to perform a cold backup by shutting down or write-locking the instance to be backed up.


h2. Performance

Riak has pluggable storage engines, with the recommended being "Bitcask":http://blog.basho.com/2010/04/27/hello-bitcask so you can tune levels of performance and durability based on your needs. (One thing that is said around the office: "Eventual consistency is no excuse for losing data.") Durability and performance can also be tuned at the request level by specifying the number of nodes that need to agree on reads and writes.
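
As a rough illustration of request-level tuning, the Python sketch below checks whether a read quorum and a write quorum are guaranteed to share at least one node. The names @n@, @r@ and @w@ (replica count, nodes that must agree on a read, nodes that must agree on a write) and the function @quorums_overlap@ are assumptions made for this example, not calls into a Riak client.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True if every read quorum must intersect every write quorum.

    With n replicas, any r readers and any w writers share at least one
    node exactly when r + w > n (pigeonhole argument).
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


if __name__ == "__main__":
    print(quorums_overlap(3, 2, 2))  # reads and writes always share a node
    print(quorums_overlap(3, 1, 1))  # faster, but a read may miss the latest write
```

Lower values trade this overlap for lower latency, which is the performance/durability dial the paragraph above refers to.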
@@ -97,6 +111,31 @@ Mongo uses a "last one wins" technique for conflict resolution.

[[http://www.mongodb.org/display/DOCS/Atomic+Operations]]

h2. Compacting Support

Riak runs periodic merges on all the non-active data files to compact the stored data space. Since merges can be costly for performance, especially in high-write scenarios, periodic merges can be "windowed" to only run at specific times (perhaps when cluster load is low).

MongoDB's compact command (v1.9+) compacts and defragments a specific collection, freeing up space in the DB as a result. The command blocks the DB, not allowing any other DB operation to run.
In addition, MongoDB Capped Collections provide the ability to store data in an auto-expiring FIFO queue collection, making it possible to keep a collection at a well-known size by replacing old records with new ones once the collection reaches a certain size limit. Capped collection storage can't fragment, so capped collections allow maximum utilization of storage space if configured appropriately.
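
The FIFO behaviour of a capped collection can be modelled in a few lines of Python. This is only a sketch of the idea: the class @CappedLog@ is hypothetical, and it limits the collection by document count for simplicity, whereas the text above speaks of a size limit.

```python
from collections import deque


class CappedLog:
    """Fixed-capacity FIFO store: the oldest record is dropped when full."""

    def __init__(self, max_docs: int) -> None:
        # deque(maxlen=...) discards items from the left once capacity is hit.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc) -> None:
        self._docs.append(doc)

    def find(self) -> list:
        """Return the stored records in insertion order."""
        return list(self._docs)


if __name__ == "__main__":
    log = CappedLog(max_docs=3)
    for i in range(5):
        log.insert({"seq": i})
    print(log.find())
```

Because storage is reused in place rather than freed and reallocated, the collection never grows past its limit and has nothing to defragment.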

h2. Data Corruption

When using Bitcask, written data is stored in an entry structure that contains the written data (key/value) and the CRC of that data. This provides protection against data corruption at the filesystem level.
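
A simplified Python sketch of a CRC-protected entry follows. The byte layout here (CRC32, key size, value size, key, value) and the helper names @encode_entry@ and @decode_entry@ are assumptions chosen for illustration and should not be read as Bitcask's exact on-disk format.

```python
import struct
import zlib

SIZES = struct.Struct(">HI")  # key size (2 bytes), value size (4 bytes)
CRC = struct.Struct(">I")     # CRC32 of everything that follows it


def encode_entry(key: bytes, value: bytes) -> bytes:
    """Serialize key/value and prefix the result with its CRC32."""
    body = SIZES.pack(len(key), len(value)) + key + value
    return CRC.pack(zlib.crc32(body)) + body


def decode_entry(raw: bytes) -> tuple:
    """Verify the CRC, then return (key, value); raise if corrupt."""
    (stored_crc,) = CRC.unpack_from(raw, 0)
    body = raw[CRC.size:]
    if zlib.crc32(body) != stored_crc:
        raise ValueError("CRC mismatch: entry is corrupt")
    key_size, value_size = SIZES.unpack_from(body, 0)
    start = SIZES.size
    key = body[start:start + key_size]
    value = body[start + key_size:start + key_size + value_size]
    return key, value


if __name__ == "__main__":
    raw = encode_entry(b"user:1", b"hello")
    print(decode_entry(raw))
```

A reader that recomputes the checksum on every entry can refuse to return damaged data instead of silently serving it.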

MongoDB has no special mechanisms for detecting data corruption, but relies on the fact that most corruption will affect the underlying BSON structure that contains the data, and so will be noticed.

h2. Data Compression

Although Bitcask doesn't support compression, compression is available when using either LevelDB or Innostore as Riak storage.

As of v2.0, MongoDB doesn't support data compression.


h2. Geospatial Indexing

Riak is content-agnostic and doesn't utilize any specific query mechanisms aside from the newly introduced Secondary Indexes in the upcoming v1.0.

MongoDB can store location-aware documents, allowing the user to query for documents based on their exact location or their proximity to a given 2D point or to a specified territory defined by a polygon. Since version 2.0, documents can have multiple locations defined.
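
To make the proximity query concrete, here is a small Python sketch that ranks documents by their distance to a 2D point. It does not use MongoDB or its query syntax; the function @near@ and the @loc@ field name are hypothetical, and distance is plain Euclidean distance on a flat plane.

```python
import math


def near(docs, point, limit=None):
    """Return docs ordered by Euclidean distance of doc["loc"] from point."""
    px, py = point
    ranked = sorted(
        docs,
        key=lambda d: math.hypot(d["loc"][0] - px, d["loc"][1] - py),
    )
    return ranked if limit is None else ranked[:limit]


if __name__ == "__main__":
    places = [
        {"name": "far", "loc": (10.0, 10.0)},
        {"name": "close", "loc": (1.0, 1.0)},
        {"name": "mid", "loc": (4.0, 3.0)},
    ]
    for doc in near(places, (0.0, 0.0), limit=2):
        print(doc["name"])
```

A real geospatial index avoids scanning and sorting every document, but the result it returns for a "near" query has this same ordering.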

h2. API

@@ -105,6 +144,13 @@ Riak offers two primary interfaces to non-Erlang clients:
1. [[HTTP|HTTP API]]
2. [[Protocol Buffers|PBC API]]

MongoDB uses a custom protocol with BSON as the interchange format, and [[10gen|http://10gen.com/]] supports clients in the most popular programming languages.
MongoDB uses a custom protocol with BSON as the interchange format, and [[10gen|http://10gen.com/]] supports clients (drivers) in the most popular programming languages.

[[http://www.mongodb.org/display/DOCS/Mongo+Wire+Protocol]]


h2. Cloud Hosting

[[Canvas Hosting|http://canvashosting.com/]] provides a dedicated Riak hosting [[solution|http://canvashosting.com/solutions/riak/]].

[[MongoHQ|https://mongohq.com/home]], [[MongoLab|https://mongolab.com/home/]] and [[MongoMachine|https://www.mongomachine.com/]] offer different hosting solutions for MongoDB.
