Swen Kooij edited this page Dec 1, 2013 · 4 revisions

We all know Facebook, one of the largest companies in the world. Facebook is one of the pioneers in the field of software development, and they keep coming up with interesting ideas to power their huge social network. Luckily for us software developers, they also release a lot of their work as open source, which means that anyone can check out the source code or contribute.

This time, they came up with something called 'RocksDB'. In their own words, RocksDB is:

A persistent key-value store for flash storage

RocksDB is based on Google's LevelDB:

https://code.google.com/p/leveldb/

RocksDB is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.
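To make the LSM idea a bit more concrete, here is a toy sketch in Python. It is not how RocksDB is implemented, and the class name `TinyLSM` and the `memtable_limit` parameter are made up for illustration. It only shows the general pattern: writes go to an in-memory table, full tables are flushed as sorted immutable runs, reads check the newest data first, and compaction merges runs.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store (illustrative only, not RocksDB)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # newest writes, held in memory
        self.runs = []       # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check the memtable first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; values from newer runs overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", "1")
db.put("b", "2")   # memtable is full, so it is flushed as the first run
db.put("a", "3")
db.put("c", "4")   # flushed as the second run
print(db.get("a"), len(db.runs))
db.compact()
print(db.get("a"), len(db.runs))
```

In this toy model, a read may have to look at every run, which is a form of read amplification. Compaction reduces the number of runs at the cost of rewriting data, which is a form of write amplification. That is the kind of tradeoff the description above refers to.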

What is RocksDB PHP?

The RocksDB library is a C++ library. RocksDB PHP is a PHP extension written in C and C++ to make the RocksDB library usable from PHP. This allows for fast key-value pair storage from PHP. The RocksDB PHP extension was NOT developed by Facebook.

I developed the RocksDB PHP extension mainly because I like programming and I like open source software. I thought it would be useful to some people. Personally I have no use for it, at least not right now. I just enjoy developing it.

The goal? Make the entire RocksDB library available from PHP.

Why should I use an embedded database instead of a database server like MySQL?

Nobody is saying that MySQL, or any other relational database management system, is bad and that embedded databases like RocksDB are better. Both have their advantages. Which one fits depends on what you're building, what kind of data you're storing, how much of it there is, and how much you care about fast data access.

The diagram above shows how traditional systems are designed. We have an application and a data source that is accessible over the network. As you can see, it takes time to read the data from disk and transmit it over the network, and it takes a lot of time when we're talking about large volumes of data. The advantage is that multiple applications can use the same database, as long as they can reach the database server.

Now let's take a look at what an embedded database looks like:

This is all logical. The database lives inside the application, so there is no network latency and we get faster access to our data. There are, however, some disadvantages:

  • No network access: the database is not distributed.
  • No failover: if the machine dies, the data becomes unavailable and may be lost.
  • No relational storage: just key-value pairs.
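As a rough illustration of what "embedded" and "just key-value pairs" mean in practice, here is a minimal sketch in Python. It uses Python's standard-library `dbm` module as a stand-in, not RocksDB or the RocksDB PHP extension, and the helper names `save_user` and `load_user` are hypothetical. The database is a local file opened inside the process, and structured records have to be serialized by the application.

```python
import dbm
import json
import os
import tempfile


def save_user(db, user_id, record):
    # A key-value store holds opaque bytes, so the record is serialized first.
    db[f"user:{user_id}".encode()] = json.dumps(record).encode()


def load_user(db, user_id):
    # Lookups are by exact key only; there is no query language.
    raw = db.get(f"user:{user_id}".encode())
    return json.loads(raw.decode()) if raw is not None else None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.db")

    # The database is a local file opened in-process: no server, no socket.
    with dbm.open(path, "c") as db:
        save_user(db, 1, {"name": "alice", "city": "Amsterdam"})

    # Reopening the file shows that the data was persisted to disk.
    with dbm.open(path, "c") as db:
        print(load_user(db, 1))
        print(load_user(db, 2))
```

Finding all users in a given city would require scanning every key or maintaining a separate index by hand, which is the kind of work a relational database does for you.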

You might be wondering at this point whether RocksDB can even compete with, for example, MySQL. No, it cannot, because the two are not directly comparable: they serve different purposes. Because RocksDB provides persistent key-value storage, it can be used for things like search caches (which is what Facebook is using it for). So what are the advantages?

  • Extremely fast reading
  • Extremely fast writing
  • Fast lookups (key-value pairs)
  • Can handle large volumes of data

It all comes down to what you need. Read some of the other wiki pages for more information.