partitioning updates working #7584


basic work for supporting partitioned tables in postgresql

these changes are associated with this pull request: #7573
which were changes associated with rails 3.2.8.

If I were to sum up the work, it would be:

  • provide an instance method arel_table; any operation that has access to the instance should use it to acquire an arel_table associated with the current model's attributes.
  • alter class method arel_table to handle parameters, with no parameters do original work -- with parameters associate the table with the specific partitioned table determined by key attributes
  • provide methods to manage key attributes and values (these are the fields the db table is partitioned on)
  • provide an instance method table_name which calls the altered class method table_name which now takes attribute values that should determine the specific partitioned table to name
  • bunch of helper methods in postgresql connection area associated with schema management. this is for reasons 1) create_schema seems like a useful method, 2) adding foreign key is needed because in postgres child partitions need to manage the foreign key references, 3) some sequence method changes to support tables in non-public (well non search path) schema: this is probably generally useful work as rails seems broken about non-public schemas
  • change any self.class.arel_table to self.arel_table
  • create is a little weird because it needs to acquire the primary key if it isn't supplied (for instance, an ID that you need to fetch from the sequence -- for this work to be complete we need to supply the model instance method "prefetch_primary_key?" instead of it being on connection, since prefetching isn't needed for any tables that aren't partitioned by a primary key)
  • some helper methods for finds (from_partition(*x)) which we've found useful in our day to day coding. this method just sets the table name (this is useful because find from the parent table even when partition keys are provided can take an inordinate time if the number of child tables is large -- so specifying the specific child table is useful).
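
To make the list above concrete, here is a minimal plain-Ruby sketch of the table_name / arel_table idea. It is not the code from this pull request: FakeTable stands in for Arel::Table, and PartitionedModel, partition_keys, base_table_name and the "_partitioned.pN" naming scheme are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for Arel::Table: it only remembers a name.
FakeTable = Struct.new(:name)

class PartitionedModel
  # Hypothetical: the attributes the database table is partitioned on.
  def self.partition_keys
    [:company_id]
  end

  def self.base_table_name
    "employees"
  end

  # With no key values, do the original (unpartitioned) work;
  # with key values, name the specific child table.
  def self.table_name(attribute_values = {})
    values = partition_keys.map { |key| attribute_values[key] }
    return base_table_name if values.any?(&:nil?)
    "#{base_table_name}_partitioned.p#{values.join('_')}"
  end

  # Cache one table object per resolved table name.
  def self.arel_table(attribute_values = {})
    @arel_tables ||= {}
    name = table_name(attribute_values)
    @arel_tables[name] ||= FakeTable.new(name)
  end

  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  # Instance methods delegate to the class with the current attributes.
  def table_name
    self.class.table_name(attributes)
  end

  def arel_table
    self.class.arel_table(attributes)
  end
end

employee = PartitionedModel.new(company_id: 42, name: "Ann")
puts employee.table_name          # employees_partitioned.p42
puts PartitionedModel.table_name  # employees
```

Code that holds an instance calls employee.arel_table, while code that only has the class still gets the parent table.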

the rest of the code to support partitioning is here: -- you'll need to pull from that branch (which doesn't try to patch rails -- so use it with this pull request). the master branch patches rails 3.2.8 correctly -- you can use it on your own. The current rubygem of partitioned patches rails in a different (and more conservative) way -- I don't think you should look at that code.

You could probably remove a bunch of stuff to make this code faster for the common non-partitioned case.

  • instance arel_table could just call class method arel_table
  • self.class.arel_table could just do the old work
  • instance table_name could just call class method table_name which did just the old work

then one might provide fixups for those methods for models where partitioning is desired.

I think the ugliest part of this code is update -- although I haven't walked down this path, it would seem the best way to manage this would be to add a hook to attribute modifications and fix up all the arel_tables that the attributes point to if the partition key values change

I'm willing to help in any way that makes sense to support partitioning in a future rails version.


Thanks! I'll review this soon (leaving town today).


It's important to understand my pull isn't designed for you to accept directly into any release (I think @tenderlove didn't request anything from this pull other than something useful for understanding how I would provide partitioning in activerecord conceptually).

I believe this code is incomplete -- in fact, in my haste to present this pull (refactoring my work into something more digestible) I've found three issues associated with non-partitioned tables [related to non partitioned tables in a non public schema] that don't exist in my original work (the original work is in the monkey patch files in partitioned gem version 1.1.0 or earlier).

I believe the basic work is sound but I think deeper thought is warranted.

I can also conceive of solutions that color outside of the lines -- that aren't necessarily interesting to me directly (as this work solves my business needs) but may be more compelling to rails people.
Specifically: dynamically generating a model's class from partition keys as a row is fetched, that is, Foo.first results in a single instance of a dynamically generated class named Foo::Partitioned42 associated with a table named foos_partitioned.p42 (I haven't completely thought this through -- but I think there is something worthy here)
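
A rough plain-Ruby sketch of that idea, using only Class.new and const_set. The partition_class method is hypothetical and nothing here touches ActiveRecord; it only shows how a per-partition class could be created on demand and given its own table name.

```ruby
class Foo
  # Hypothetical helper: returns (creating on first use) a subclass bound
  # to one partition, e.g. Foo::Partitioned42 for key 42.
  def self.partition_class(key)
    const_name = "Partitioned#{key}"
    return const_get(const_name, false) if const_defined?(const_name, false)

    parent_table = "foos_partitioned"
    klass = Class.new(self) do
      define_singleton_method(:table_name) { "#{parent_table}.p#{key}" }
    end
    const_set(const_name, klass)
  end
end

klass = Foo.partition_class(42)
puts klass.name        # Foo::Partitioned42
puts klass.table_name  # foos_partitioned.p42
```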

I have other thoughts -- but I think @tenderlove should soak this work in.


Any update?


How is this going? Can I help push this along?

@frodsan frodsan referenced this pull request Oct 26, 2012

Partitioned #7573

gaurish commented Dec 30, 2012

Bumping this so @tenderlove or others might have chance to give feedback.

@tenderlove tenderlove commented on the diff Jan 15, 2013
@@ -73,6 +73,14 @@ def initialize(base)
@base = base
+ #
+ # Builds a SQL check constraint
+ #
+ # @param [String] constraint a SQL constraint
+ def check_constraint(constraint)
+ @columns << Struct.new(:to_sql).new("CHECK (#{constraint})")
tenderlove Jan 15, 2013 Ruby on Rails member

We should move this to ARel (I think). I see why we need this, but I don't like the implementation. ;-)

simonoff Jan 15, 2013

But it's a DB-aware setting. MySQL has different syntax for partitioning. I think it can be a stub in base adapter with different implementations for each DB.

tenderlove Jan 15, 2013 Ruby on Rails member

I'm specifically complaining about the Struct thing. We should add a node to ARel that handles CHECK
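
To illustrate the shape of that suggestion (a dedicated node that a visitor turns into SQL, rather than an ad-hoc Struct responding to to_sql), here is a toy sketch in plain Ruby. CheckConstraint and SqlVisitor are hypothetical names and are not part of ARel's API.

```ruby
# Hypothetical AST node for a CHECK constraint; not an actual Arel class.
CheckConstraint = Struct.new(:expression)

# Toy visitor that knows how to render the node as SQL.
class SqlVisitor
  def accept(node)
    case node
    when CheckConstraint then "CHECK (#{node.expression})"
    else raise ArgumentError, "unsupported node: #{node.class}"
    end
  end
end

puts SqlVisitor.new.accept(CheckConstraint.new("company_id = 42"))
# CHECK (company_id = 42)
```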

@tenderlove tenderlove commented on the diff Jan 15, 2013
@@ -133,8 +133,25 @@ def ===(object)
# class Post < ActiveRecord::Base
# scope :published_and_commented, published.and(self.arel_table[:comments_count].gt(0))
# end
- def arel_table
- @arel_table ||= Arel::Table.new(table_name, arel_engine)
+ def arel_table(arel_attribute_values = {})
+ @arel_tables ||= {}
+ if arel_attribute_values.blank?
tenderlove Jan 15, 2013 Ruby on Rails member

Change this to empty?


I've merged master in to this branch and pushed a copy to my fork of Rails.

(I'm going to write some stuff here, and I may be off base, so please correct me or confirm my comments!)

From what I can gather, one of the main goals of this pull request is to make the value that arel_table returns dynamic. This is because we need to somehow calculate the table name for the partitioned table?

I am in favor of making the return value of arel_table dynamic, but I don't think passing in the attribute values, then calling partition_keys is the right way to go. It seems to me that we have two different strategies for calculating the correct table:

  1. Just the table name derived from the class (the current behavior)
  2. Calculate the table name based on partition keys

Rather than passing stuff to arel_table, maybe we should split out these two objects and configure the class to use the particular strategy. e.g.:

class User < AR::Base
  partitioned [:name] # Configures the class to be partitioned and teaches the class about its keys
end

Does this seem reasonable?
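
One way to picture the two strategies as separate objects, sketched in plain Ruby. Everything here (StaticTableName, PartitionedTableName, Model, the table naming scheme) is an assumption for illustration, not an existing ActiveRecord API.

```ruby
# Strategy 1: the table name derived from the class alone.
class StaticTableName
  def initialize(base_name)
    @base_name = base_name
  end

  def table_name(_attributes = {})
    @base_name
  end
end

# Strategy 2: the table name calculated from partition key values.
class PartitionedTableName
  def initialize(base_name, keys)
    @base_name = base_name
    @keys = keys
  end

  def table_name(attributes = {})
    values = @keys.map { |key| attributes[key] }
    return @base_name if values.any?(&:nil?)
    "#{@base_name}_partitioned.p#{values.join('_')}"
  end
end

# Hypothetical base class: each model is configured with one strategy.
class Model
  def self.table_strategy
    @table_strategy ||= StaticTableName.new("#{name.downcase}s")
  end

  def self.partitioned(keys)
    @table_strategy = PartitionedTableName.new("#{name.downcase}s", keys)
  end

  def self.table_name(attributes = {})
    table_strategy.table_name(attributes)
  end
end

class User < Model
  partitioned [:name]
end

class Post < Model
end

puts User.table_name(name: "bob")  # users_partitioned.pbob
puts Post.table_name               # posts
```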


This sounds reasonable given information I probably don't have.

arel_table is fussed with for cases where the target table in the sql statement is acquired from attributes.
(i believe the update case does this by reaching into the first attribute that is to be modified and acquiring its arel_table and finding the table from THAT).

which I believe is here:

module Arel
  # FIXME hopefully we can remove this
  module Crud
    def compile_update values
      um = UpdateManager.new @engine

      if Nodes::SqlLiteral === values
        relation = @ctx.from
      else
        relation = values.first.first.relation
      end
      um.table relation
      um.set values
      um.take @ast.limit.expr if @ast.limit
      um.wheres = @ctx.wheres
      um
    end
  end
end

Since there is no back reference to the model instance in the Arel::Table (only a reference to the model's class and the table's name at Arel::Table instance creation time) something must be done to get the information needed to calculate the table's name at the code "relation = values.first.first.relation" invocation time.

What is the best way to do that?


There are probably many ways to slice this bacon that I haven't considered. One option is marking attributes as part of the partition keys (and then forcing those to be passed to compile_update, since they wouldn't normally be passed because they would generally not be attributes that are changed). This would have the benefit of supplying the partition keys in the where clause OR supplying the specific table name for the relation (for partition solutions that wish to use one or the other solution).

It also seems like @arel_table could simply be re-calculated whenever a partition key is modified. I'm investigating this solution (although haven't tried to implement it yet). I noticed the correct hook locations seem to be:


[edited note: I investigated re-calculating @arel_table in a #write_attribute hook. It works, but I see no win. It doesn't make the code cleaner and is just less obvious what is happening if anyone is trying to figure out how the code works.]


From what I can gather, one of the main goals of this pull request is to make the value that
arel_table returns dynamic. This is because we need to somehow calculate the table name
for the partitioned table?

Stepping back, we actually want a dynamic "model::table_name", which could simply be model#table_name and model::table_name(attributes = {}), and we want said method to be lazily evaluated such that the table name is only computed after all attributes (actually just the partition key attributes) have been set, just before sql statements for insert/update/delete are constructed.

It just turns out we need to do this through the arel_table because THAT is the context statement construction is given to work with to compute the table name.
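
A small sketch of what "lazily evaluated" means here, in plain Ruby. LazyTable is a hypothetical stand-in for a table object whose name is resolved only when a statement is built; it is not how Arel::Table works today.

```ruby
# Hypothetical table object whose name is resolved on demand.
class LazyTable
  def initialize(&resolver)
    @resolver = resolver
  end

  # Called at statement construction time, not at table creation time.
  def name
    @resolver.call
  end
end

attributes = {}
table = LazyTable.new do
  key = attributes[:company_id]
  key ? "employees_partitioned.p#{key}" : "employees"
end

attributes[:company_id] = 7  # partition key is set after the table object exists
puts "UPDATE #{table.name} SET name = 'Ann' WHERE id = 1"
# UPDATE employees_partitioned.p7 SET name = 'Ann' WHERE id = 1
```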

For select statement construction we need something else to determine the partition because the key values may not be specified (although we could ignore this case since "select * from foos" should visit all relevant partitions). I found a very simple way of fast tracking the specific table -- I use a scope like thing: from_partition(x) where 'x' are the partition key values, so:


This is described in a different way, in the README for the partitioned gem

[brain fart edited out]


(i've been reviewing my patches to ensure I respond to your questions in the most accurate way possible)

with respect to the issue with passing attributes to dynamic_arel_table: this is done for exactly one case, constructing an INSERT statement, here:

class Relation
  # Patches {ActiveRecord}'s building of an insert statement to request
  # of the model a table name with respect to attribute values being
  # inserted.
  # The differences between this and the original code are small and marked
  # with PARTITIONED comment.
  def insert(values)
    primary_key_value = nil

    if primary_key && Hash === values
      primary_key_value = values[values.keys.find { |k|
        k.name == primary_key
      }]

      if !primary_key_value && connection.prefetch_primary_key?(klass.table_name)
        primary_key_value = connection.next_sequence_value(klass.sequence_name)
        values[klass.arel_table[klass.primary_key]] = primary_key_value
      end
    end

    im = arel.create_insert
    # PARTITIONED ADDITION. get arel_table from class with respect to the
    # current values to be placed in the table (which hopefully hold the values
    # that are used to determine the child table this insert should be
    # redirected to)
    actual_arel_table = @klass.dynamic_arel_table(Hash[*values.map { |k, v| [k.name, v] }.flatten]) if @klass.respond_to?(:dynamic_arel_table)
    actual_arel_table = @table unless actual_arel_table
    im.into actual_arel_table

    conn = @klass.connection

    substitutes = values.sort_by { |arel_attr, _| arel_attr.name }
    binds       = substitutes.map do |arel_attr, value|
      [@klass.columns_hash[arel_attr.name], value]
    end

    substitutes.each_with_index do |tuple, i|
      tuple[1] = conn.substitute_at(binds[i][0], i)
    end

    if values.empty? # empty insert
      im.values = Arel.sql(connection.empty_insert_statement_value)
    else
      im.insert substitutes
    end

    conn.insert(im, 'SQL', primary_key, primary_key_value, nil, binds)
  end
end


the code (around: im.into actual_arel_table) has access to the model class and the Arel::Table instance as calculated by the model class. We need to pass an appropriate Arel::Table for the specific partition table to im.into; my choice is to pass attributes to arel_table and let it manage the caching of all Arel::Tables that are generated for a specific model.

I originally considered (but this seemed intrusive for me to put in a patch) to refactor the call path to Relation#insert such that it had access to the model instance (so Relation can call back to the model instance to generate the Arel::Table for the specific partition table needed for this insert). I admit it never got past the consideration phase once I realized ::dynamic_arel_table could simply be passed all attribute values and calculate it when needed.


Another consideration is a "dynamic default scope" for lack of a better word.

Operations on a model instance, like #delete, #reload, and #update should not use class.unscoped, but rather allow the model to assist in scoping associated with partitioning.

the partitioned gem adds ::from_partition, a class method that resolves to model_class.from("#{partition_table_name} AS #{parent_table_name}")

This is required because the only attribute passed to delete is id. For partitioned tables the best results come from including all keys needed to specify the specific partition AND the primary key for the target row.
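
To show the effect of the FROM aliasing described above, here is a plain-Ruby sketch that only builds SQL strings. PartitionFinder and its methods are hypothetical; the real ::from_partition lives in the partitioned gem and returns a relation, not a string.

```ruby
module PartitionFinder
  # Mirrors the "#{partition_table_name} AS #{parent_table_name}" fragment:
  # the child table is read under the parent's name, so column references
  # written against the parent table keep working.
  def self.from_clause(parent_table, partition_table)
    "#{partition_table} AS #{parent_table}"
  end

  def self.select_by_id(parent_table, partition_table, id)
    "SELECT #{parent_table}.* FROM #{from_clause(parent_table, partition_table)} " \
      "WHERE #{parent_table}.id = #{Integer(id)}"
  end
end

puts PartitionFinder.select_by_id("foos", "foos_partitioned.p42", 1)
# SELECT foos.* FROM foos_partitioned.p42 AS foos WHERE foos.id = 1
```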

Here is delete, as patched by partitioned:

    def delete
      if persisted?
      @destroyed = true

It's my opinion ActiveRecord should provide some instance method that, by default, returns self.class.unscoped (i'm guessing that is the correct default) which methods such as #delete, #reload, and #update use and partitioned models can override to better target the specific partition.
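
A sketch of such a hook in plain Ruby, building SQL strings instead of relations. Record, PartitionedRecord, instance_scope and delete_sql are hypothetical names, and the "foos_YYYYMMDD" child table naming is an assumption for illustration.

```ruby
require "date"

class Record
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  def self.table_name
    "foos"
  end

  # Hypothetical hook: by default, target the parent table by primary key.
  def instance_scope
    { table: self.class.table_name, where: { id: attributes[:id] } }
  end

  # Stand-in for #delete: it asks the hook where the row lives.
  def delete_sql
    scope = instance_scope
    conditions = scope[:where].map { |column, value| "#{column} = #{value.inspect}" }
    "DELETE FROM #{scope[:table]} WHERE #{conditions.join(' AND ')}"
  end
end

class PartitionedRecord < Record
  # A partitioned model overrides the hook to name the child table directly.
  def instance_scope
    day = attributes[:created_at].strftime("%Y%m%d")
    super.merge(table: "#{self.class.table_name}_#{day}")
  end
end

puts Record.new(id: 1).delete_sql
# DELETE FROM foos WHERE id = 1
puts PartitionedRecord.new(id: 1, created_at: Date.new(2010, 8, 5)).delete_sql
# DELETE FROM foos_20100805 WHERE id = 1
```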

For those reading and wondering why the following sql isn't sufficient:

delete from foos where id = 1

if a table is partitioned by created_at::date, the above sql will search all partitioned tables for the target row. if the sql is changed to:

delete from foos where id = 1 and created_at::date = '2010-08-05'

or better yet:

delete from foos_20100805 where id = 1

there is far less work for the database to find the target row.


another consideration, which I've only recently started looking at, is how to handle associations to partitioned tables.

Here is a convoluted example of a message model with attachments:

The messages table is partitioned by created_at.

class Message < ::Partitioned::ByCreatedAt
  attr_accessible :from_user_id, :to_user_id, :subject
  belongs_to :from_user, :class_name => 'User'
  belongs_to :to_user, :class_name => 'User'
  has_many :attachments, :class_name => 'Attachment', :conditions => lambda {|a| "attachments.created_at::date = '#{created_at.to_date}'" }
end

The attachments table is partitioned along with messages table so that foreign key references can be applied and keep referential integrity (this is an issue with how Postgres handles foreign keys TO a partitioned table -- there is no other option other than to partition both tables by the same values)

class Attachment < ByMessageCreatedAt
  attr_accessible :message_created_at, :message_id, :data
  belongs_to :message, :class_name => 'Message', :conditions => lambda {|a| "messages.created_at::date = '#{message_created_at.to_date}'" }
end

I've been handling these cases by using conditions, which is not perfect. It would be ideal if an association could call back to the model to request the scope to fetch the target with all the knowledge a model instance can provide for targeting.
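
As a sketch of what that callback could look like, in plain Ruby with SQL strings: the association asks the owning instance for extra conditions instead of having them hard-coded in :conditions. The class names, association_scope and attachments_sql are hypothetical; this is not an existing ActiveRecord hook.

```ruby
require "date"

class PlainMessage
  attr_reader :id, :created_at

  def initialize(id, created_at)
    @id = id
    @created_at = created_at
  end

  # Hypothetical hook: extra conditions for loading an association's target.
  # The default adds nothing.
  def association_scope(_association_name)
    {}
  end
end

class PartitionedMessage < PlainMessage
  # The instance knows its partition key, so it can narrow the lookup.
  def association_scope(association_name)
    return super unless association_name == :attachments
    { "attachments.created_at::date" => created_at.to_date.to_s }
  end
end

def quote(value)
  value.is_a?(String) ? "'#{value}'" : value.to_s
end

# Stand-in for what an association would do with the hook's answer.
def attachments_sql(message)
  conditions = { "attachments.message_id" => message.id }
  conditions = conditions.merge(message.association_scope(:attachments))
  where = conditions.map { |column, value| "#{column} = #{quote(value)}" }
  "SELECT attachments.* FROM attachments WHERE #{where.join(' AND ')}"
end

puts attachments_sql(PartitionedMessage.new(1, Date.new(2010, 8, 5)))
# SELECT attachments.* FROM attachments WHERE attachments.message_id = 1 AND attachments.created_at::date = '2010-08-05'
```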


I'm wondering if this has fallen by the wayside? Is there anything I can do to help it along?
If you'd like me to take another crack at the work, please speak up. To sum up what I believe needs to be done:

  • alter and write methods to fetch table name:
    • ::table_name(attributes = {}) and #table_name
  • alter and write methods to fetch arel_table:
    • ::arel_table(attributes = {}) and #arel_table
  • write Arel::Node for CHECK constraint
  • add TableDefinition::check_constraint(constraint_expression)
  • add method to fetch scope from model instance:
    • #scope (default returns model_class.unscoped)
  • update methods #delete, #reload, #update to use #scope
  • add method to assist association with scoping:
    • #association_scope(target_association) (default returns model_class.unscoped)
  • update code to use #association_scope

@keithgabryelski @tenderlove Is this still being worked on or is it abandoned now? Since it's been over 2 years since it's been commented on, perhaps it should be closed? Or can something be done to move this along?
