harrisj edited this page Sep 12, 2010 · 5 revisions

The DBSlayer ActiveRecord Adapter

Welcome to the wiki for the DBSlayer ActiveRecord adapter: an adapter that lets ActiveRecord talk to the DBSlayer MySQL proxying backend developed by the New York Times. DBSlayer is a new product, as is this adapter, so you might encounter a few bugs or crappy coding practices on my part, but it seems to work pretty well so far.

What is DBSlayer?

DBSlayer is a lightweight connection pooling layer built from open-source components that speaks the protocols of the web (HTTP and JSON) rather than some funky proprietary binary format. Since it is built from web components, it is designed to scale and be shared in a web way. DBSlayer is written in C and uses the libapr and libmysqlclient libraries and not much else (only 2KLOC!). For more details on DBSlayer including the source repository, visit the official product page at http://code.nytimes.com/projects/dbslayer

What is the DBSlayer ActiveRecord Adapter?

The DBSlayer ActiveRecord adapter is just an adapter for talking to DBSlayer. I could claim it’s an amazing bit of code, but all I really did was subclass the existing MySQL adapter and modify it so that it connects to a DBSlayer instance over HTTP + JSON rather than using the underlying MySQL gem.
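To make "HTTP + JSON" a bit more concrete, here is a minimal sketch of what a raw exchange with DBSlayer might look like from Ruby. It is not the adapter's actual code. It assumes that DBSlayer accepts a URL-encoded JSON document with an `SQL` key after `/db?` and replies with a `RESULT` object containing `HEADER` and `ROWS`; check the DBSlayer documentation for the exact format. The helper names, the host and port defaults, and the sample data are all made up for illustration.

```ruby
require 'json'
require 'erb'

# Hypothetical helper: build the URL for one DBSlayer query.
# Assumption: the query is sent as URL-encoded JSON ({"SQL": "..."}) after "/db?".
def dbslayer_query_url(sql, host: 'localhost', port: 9090)
  payload = JSON.generate('SQL' => sql)
  "http://#{host}:#{port}/db?#{ERB::Util.url_encode(payload)}"
end

# Hypothetical helper: turn a response body into an array of row hashes.
# Assumption: the body has a RESULT object with HEADER (column names)
# and ROWS (one array of values per row).
def dbslayer_rows(body)
  result = JSON.parse(body).fetch('RESULT')
  header = result.fetch('HEADER')
  result.fetch('ROWS').map { |row| header.zip(row).to_h }
end

# Against a live instance you would fetch the body over HTTP, for example:
#   require 'net/http'
#   body = Net::HTTP.get(URI(dbslayer_query_url('SELECT id, name FROM restaurants')))
# Here a canned response (invented data) stands in for the server.
sample_body = '{"RESULT": {"HEADER": ["id", "name"], "ROWS": [[1, "Example Diner"]]}}'
p dbslayer_rows(sample_body)
```

Because the whole exchange is a plain GET and a JSON body, the same few lines could be written in pretty much any language with an HTTP client.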

Why DBSlayer?

I’m not going to tell you that DBSlayer is a silver bullet for Internet scalability, but it does have some distinctive features that might help in your architecture:

  • It uses a stateless HTTP layer for communication. This means any language that has an HTTP layer can interoperate with it, and you can also add Squid proxies, load balancers, etc. for kicks. This might help with scale.
  • It’s a central point for access to the database. This allows you to abstract away the database configuration from your backend applications, meaning you can reconfigure your databases without a cap deploy and also transparently swap in advanced techniques like sharding and master/slave databases at the DBSlayer layer.
  • DBSlayer is also the only component that needs to be compiled against the libmysqlclient library. No more aggravation from compiling the mysql gem.
  • It also centralizes the database configuration in one place. If you have multiple web applications and loading scripts working with your database, going through the DBSlayer can help consolidate your configurations.

Some Caveats

Because DBSlayer is stateless and shares connections across multiple requests, certain stateful MySQL operations aren’t supported by DBSlayer at all:

  • Disabling referential integrity
  • Transactions (while you could do a transaction with DBSlayer, I haven’t figured out how to make it work with AR’s model yet)

The biggest problem is that Rails sets a connection variable for all MySQL connections in order to fix an error selecting null IDs (http://dev.rubyonrails.org/ticket/6778). Since DBSlayer executes queries on a pool of multiple backend connections to the database, this fix doesn’t work (it would be applied to one connection, but subsequent queries would be run on others), and there unfortunately doesn’t seem to be a database-specific setting you can alter instead. As a result, the following type of query WILL return unexpected results when using the DBSlayer adapter:

Restaurant.find(:all, :conditions => 'id IS NULL')

The problem only occurs when searching for null values in an autoincrement primary key column (not in other columns). With the regular MySQL adapter, this would return records with a null id value (admittedly a weird case). With the DBSlayer adapter, it returns the item most recently inserted into the table (this is the default MySQL behavior). Luckily, this is not a common idiom in Rails, but you should be aware of it if you attempt to find such items in your tables.
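To see why a per-connection fix gets lost behind a pool, here is a toy simulation in plain Ruby. It does not talk to MySQL or DBSlayer. The `FakeConnection` and `FakePool` classes are hypothetical, the round-robin dispatch is an assumption about how a pool might hand out connections, and `SQL_AUTO_IS_NULL` is (to my understanding) the variable the Rails fix sets, though the text above doesn't name it.

```ruby
# Toy model of one backend connection: it keeps its own session variables.
class FakeConnection
  def initialize
    # Assumed default: with this set to 1, "id IS NULL" matches the last inserted row.
    @session = { 'SQL_AUTO_IS_NULL' => 1 }
  end

  # Applies "SET name=value" statements to this connection only and
  # returns a copy of the session state the statement ran under.
  def execute(sql)
    if (m = sql.match(/\ASET (\w+)\s*=\s*(\d+)\z/i))
      @session[m[1].upcase] = m[2].to_i
    end
    @session.dup
  end
end

# Toy model of a stateless proxy: each statement goes to the next
# connection in the pool (round-robin is an assumption, not DBSlayer's documented policy).
class FakePool
  def initialize(size)
    @connections = Array.new(size) { FakeConnection.new }
    @next = 0
  end

  def execute(sql)
    conn = @connections[@next]
    @next = (@next + 1) % @connections.size
    conn.execute(sql)
  end
end

pool = FakePool.new(3)
pool.execute('SET SQL_AUTO_IS_NULL=0')   # lands on connection 0
state = pool.execute('SELECT 1')         # lands on connection 1
puts state['SQL_AUTO_IS_NULL']           # prints 1: the SET did not carry over
```

The same reasoning applies to anything else that lives in a connection's session, which is why transactions are on the caveat list too.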

Isn’t It A Single Point of Failure?

Yeah, you’ve got us there. If DBSlayer goes down, it will take out all database access for that application. But the same is often true of the database itself in most architectures, and it’s probably better for DBSlayer to take the fall rather than MySQL. In addition, DBSlayer is built on top of the same solid, scalable libraries as Apache (with not much else), so there’s not a lot of room for serious bugs. Still, why is it designed this way? We decided that the benefits of a central point for managing database configurations and injecting logic for caching, security, transformations, etc. outweighed the risk of a single point of failure. Of course, the sharp-eyed among you may have realized that you can deal with the issue by placing a load balancer in front of multiple DBSlayer instances. Such things are easy when you have a service that speaks HTTP.