Skip to content
Efficient tree structures for ActiveRecord
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.



Arboreal is yet another extension to ActiveRecord to support tree-shaped data structures.

Arboreal surfaces relationships within the tree like children, ancestors, descendants, and siblings as scopes, so that additional filtering/pagination can be performed.

It delegates as much work as possible to the underlying DBMS, making it efficient to:

  • fetch all ancestors, descendants or siblings of a node

  • move nodes (or subtrees) around

  • prevent loops

  • rebuild the hierarchy

Getting started

First, install the “arboreal” gem, and add it to your Rails project's config/environment.rb.

Next, you'll need a migration to add parent_id and ancestry_string columns, and indices:

class MakeThingsArboreal < ActiveRecord::Migration

  def self.up
    add_column "things", "parent_id", :integer
    add_index "things", ["parent_id"]
    add_column "things", "ancestry_string", :string
    add_index "things", ["ancestry_string"]

  def self.down
    remove_index "things", ["ancestry_string"]
    remove_column "things", "ancestry_string"
    remove_index "things", ["parent_id"]
    remove_column "things", "parent_id"


Finally, you can declare your model arboreal:

class Thing < ActiveRecord::Base


  # .. etc etc ...


Navigating the tree

Arboreal adds the basic relationships you'd expect:

  • parent

  • children

In addition, it provides the following handy methods on each tree-node:

  • ancestors

  • descendants

  • subtree (the node itself, plus descendants)

  • siblings

  • root (the topmost ancestor)

The first four return scopes, to which additional filtering, ordering or limits may be applied.

At the class-level:

  • roots is a named-scope returning all the nodes without parents

  • rebuild_ancestry rebuilds the ancestry cache, as described below

Rebuilding the ancestry cache

Internally, Arboreal uses the ancestry_string column to cache the path down the tree to each node (or more correctly, it's parent. This technique - a variant of “path enumeration” or “materialized paths” - allows efficient retrieval of both ancestors and descendants.

It's conceivable that the computed ancestry-string values may get out of whack, particularly if changes are made directly to the database. If you suspect corruption, you can restore sanity using rebuild_ancestry, e.g


The ancestry rebuild is implemented in SQL to leverage the underlying DBMS, and so is pretty efficient, even on large trees.

You can’t perform that action at this time.