nickh edited this page Feb 15, 2011 · 2 revisions

Dynashard Home

The Dynashard gem provides an easy way to configure ActiveRecord 3.x models to shard across multiple databases.

Background

ActiveRecord uses the model class to determine which database connection to use. By default, all models use the connection from ActiveRecord::Base (generally the entry in config/database.yml with the same name as the current environment), but you can override the default by establishing a connection in your model:

class Widget < ActiveRecord::Base
  establish_connection 'other_database'
end

This approach could be used for sharding by creating an entry per shard in config/database.yml and creating a subclass per shard:

class Widget < ActiveRecord::Base ; end

# A widget that lives on shard1
class Shard1Widget < Widget
  establish_connection 'shard1'
end

# A widget that lives on shard2
class Shard2Widget < Widget
  establish_connection 'shard2'
end

# ...

Once you know which shard you want, you can use the corresponding subclass to manage models on that shard:

@widget = Shard1Widget.find(:first)
@other_widget = Shard2Widget.create(:name => 'My awesome widget')

This approach can become complex if you want to shard models based on information in other models. For example, if you have an association between users and widgets and want to shard by user, you might do something like this:

class User < ActiveRecord::Base ; end

class Widget < ActiveRecord::Base ; end

# A widget that lives on shard1
class Shard1Widget < Widget
  establish_connection 'shard1'
  belongs_to :shard1_user
end

# A widget that lives on shard2
class Shard2Widget < Widget
  establish_connection 'shard2'
  belongs_to :shard2_user
end

# A user whose widgets live on shard1
class Shard1User < User
  has_many :shard1_widgets
end

# A user whose widgets live on shard2
class Shard2User < User
  has_many :shard2_widgets
end

# ...

This can be difficult to maintain as the number of shards grows, and requires a change to config/database.yml and an app restart whenever shards are added or removed.

How Dynashard Works

Dynashard uses a similar approach, but generates shard classes and sharded model classes dynamically. The first time a model uses a shard, Dynashard creates classes roughly like this:

class Dynashard::Shard0
  establish_connection the_new_shard
end

class Dynashard::Shard0::Widget < Widget ; end
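The idea of generating a class per shard on first use can be illustrated in plain Ruby. The following is a minimal sketch, not Dynashard's actual implementation: the `ShardSketch` module, its `sharded_class` method, and the `connection_name` method are hypothetical names invented for this example, and the `Widget` class here is a plain Ruby stand-in rather than an ActiveRecord model.

```ruby
# Hypothetical stand-in for an ActiveRecord model; no database is involved.
class Widget
  def self.connection_name
    'default'
  end
end

# Hypothetical module illustrating lazy, per-shard class generation.
module ShardSketch
  # Maps a shard name to its generated namespace class.
  @shards = {}

  # Returns a subclass of +model+ tied to +shard_name+, creating the shard
  # namespace and the subclass the first time they are requested.
  def self.sharded_class(model, shard_name)
    shard = (@shards[shard_name] ||= begin
      namespace = Class.new
      # Shards are numbered in the order they are first used: Shard0, Shard1, ...
      const_set("Shard#{@shards.size}", namespace)
      namespace
    end)

    const_name = model.name.split('::').last.to_sym
    # Pass false so the lookup does not fall back to the top-level Widget.
    return shard.const_get(const_name, false) if shard.const_defined?(const_name, false)

    sharded = Class.new(model)
    # Stands in for a per-shard database connection.
    sharded.define_singleton_method(:connection_name) { shard_name }
    shard.const_set(const_name, sharded)
  end
end

widget_on_a = ShardSketch.sharded_class(Widget, 'shard_a')
puts widget_on_a.name            # ShardSketch::Shard0::Widget
puts widget_on_a.connection_name # shard_a
```

Because the generated class is a subclass of the original model, it keeps the model's behaviour and only overrides the part that decides where the data lives.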

You never need to use the generated classes directly, but rather configure your models to be shard-aware and provide a shard context when using your model so that Dynashard can find the appropriate connection:

class Widget < ActiveRecord::Base
  shard :by => :user
end

In the above example, all database access for the Widget model requires that a sharding context for :user be specified. This can be done once to be used for all subsequent access, or can be done around a block:

# This context stays in effect for all subsequent access. All models configured to shard :by => :planet
# will use the 'earth' connection from config/database.yml.
Dynashard.shard_context[:planet] = 'earth'
@leader = Leader.find(:first)

# This context is only used for access inside the block. All models configured to shard :by => :user
# will use the 'nickh' connection from config/database.yml.
Dynashard.with_context(:user => 'nickh') do
  @widgets = Widget.find(:all)
end
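The difference between a persistent context and a block-scoped one can be sketched in plain Ruby. This is an illustration under stated assumptions, not Dynashard's source: the `ContextSketch` module is a hypothetical name, and storing the context in `Thread.current` is an assumption made for the example.

```ruby
# Hypothetical module mimicking a persistent and a block-scoped shard context.
module ContextSketch
  # Returns the context hash for the current thread, e.g. { :user => 'nickh' }.
  # It is kept in Thread.current so concurrent threads do not share it.
  def self.shard_context
    Thread.current[:context_sketch] ||= {}
  end

  # Merges +overrides+ into the context while the block runs, then restores
  # the previous context, even if the block raises.
  def self.with_context(overrides)
    previous = shard_context.dup
    shard_context.merge!(overrides)
    yield
  ensure
    shard_context.replace(previous)
  end
end

# Persistent: stays set until it is changed again.
ContextSketch.shard_context[:planet] = 'earth'

# Block-scoped: :user is only present inside the block.
ContextSketch.with_context(:user => 'nickh') do
  p ContextSketch.shard_context
end
p ContextSketch.shard_context
```

Restoring the context in an `ensure` clause means a failure inside the block cannot leave later queries pointed at the wrong shard.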

Dynashard supports sharded associations. For example, in the sample models above, the User determines which shard the Widget should use:

class User < ActiveRecord::Base
  shard :associated, :using => :shard_name
  has_many :widgets

  def shard_name
    # return the name of an entry from config/database.yml
  end
end

class Widget < ActiveRecord::Base
  shard :by => :user
  belongs_to :user
end

@user = User.find(:first)
@user.widgets.create(:name => 'This will be created on the database returned by @user.shard_name')

Shard context values can refer to entries in config/database.yml but may also be hashes that contain database connection parameters:

class User < ActiveRecord::Base
  shard :associated, :using => :shard_params

  # Connection parameters for this user's own SQLite database
  def shard_params
    {
      :adapter  => 'sqlite3',
      :database => "db/#{self.name}.sqlite3"
    }
  end
end

In addition, values can be any object that responds to :call and returns either a reference to an entry in config/database.yml or a hash of database connection parameters.
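The three kinds of shard value described above (an entry name, a hash of connection parameters, or a callable that returns either) could be normalised along these lines. This is a sketch only: `resolve_shard_value` is a hypothetical helper written for this example and is not part of Dynashard's documented API.

```ruby
# Hypothetical helper: turns a shard context value into either the name of a
# config/database.yml entry (String) or a hash of connection parameters.
def resolve_shard_value(value)
  # Procs, lambdas and Method objects are called first to get the real value.
  value = value.call if value.respond_to?(:call)

  case value
  when String, Symbol then value.to_s # name of a config/database.yml entry
  when Hash           then value      # explicit connection parameters
  else raise ArgumentError, "unsupported shard value: #{value.inspect}"
  end
end

p resolve_shard_value('shard1')
p resolve_shard_value({ :adapter => 'sqlite3', :database => 'db/nickh.sqlite3' })
p resolve_shard_value(lambda { 'shard2' })
```

A callable is useful when the shard can only be decided at query time, because it is evaluated on each lookup rather than once at configuration.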
