Simple model layer that lets you query text documents as if they were a database.

Document Mapper

Document Mapper is an object mapper for plain text documents. The documents look like the ones used in Jekyll, toto or Serious. They consist of a preamble written in YAML (also called YAML front matter), followed by content in the format you prefer, e.g. Textile. This lets you write documents in your favorite editor and access their content and metadata from your Ruby scripts.

Step-by-step tutorial

Documents look something like this. The part between the two --- lines is the YAML front matter. After the second ---, there is one blank line, followed by the content of the file. All items in the YAML front matter, as well as the content, are accessible through Document Mapper.

---
id: 1
title: Ruby is great
tags: [programming, software]
number_of_foos: 42
status: published
---

I like Ruby.
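
To make the format concrete, here is a minimal sketch of how such a file could be split into metadata and content using only Ruby's standard library. It is an illustration under simple assumptions, not Document Mapper's actual implementation, and the helper name `split_front_matter` is hypothetical.

```ruby
require 'yaml'

# Hypothetical helper (not part of Document Mapper): splits a document
# into its YAML front matter (as a Hash) and its content (as a String).
def split_front_matter(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*\n(.*)\z/m)
  raise ArgumentError, 'no YAML front matter found' unless match

  attributes = YAML.safe_load(match[1])
  content = match[2].strip
  [attributes, content]
end

sample = <<~DOC
  ---
  id: 1
  title: Ruby is great
  tags: [programming, software]
  number_of_foos: 42
  status: published
  ---

  I like Ruby.
DOC

attributes, content = split_front_matter(sample)
puts attributes['title']        # prints "Ruby is great"
puts attributes['tags'].inspect # prints ["programming", "software"]
puts content                    # prints "I like Ruby."
```

Note that this sketch returns a plain Hash with string keys, whereas Document Mapper exposes the same values as methods on the document object.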

To access the values in the front matter, create a class that includes DocumentMapper::Document.

require 'document_mapper'
class MyDocument
  include DocumentMapper::Document
end

Initializing single documents

doc = MyDocument.from_file('./documents/document-file.textile')

Accessing the attributes of single documents

doc.title                    # => "Ruby is great"
doc.tags                     # => ["programming", "software"]
doc.content                  # => "I like Ruby."

Date recognition

You can either set the date of a document in the YAML front matter or let it be taken from the file name. A file named 2010-08-07-test-document-file.textile will return a date like this:

doc.date                     # => #<Date: 2010-08-07 ...>
doc.date.to_s                # => "2010-08-07"
doc.year                     # => 2010
doc.month                    # => 8
doc.day                      # => 7
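
As a rough illustration of file-name-based date recognition, the sketch below reads a leading YYYY-MM-DD prefix with a regular expression. The helper `date_from_file_name` is hypothetical and only approximates the idea; the library's own parsing rules may differ.

```ruby
require 'date'

# Hypothetical helper (not part of Document Mapper): reads a leading
# YYYY-MM-DD prefix from a file name and returns a Date, or nil if the
# file name does not start with such a prefix.
def date_from_file_name(file_name)
  match = File.basename(file_name).match(/\A(\d{4})-(\d{1,2})-(\d{1,2})-/)
  return nil unless match

  Date.new(match[1].to_i, match[2].to_i, match[3].to_i)
end

date = date_from_file_name('2010-08-07-test-document-file.textile')
puts date.to_s  # prints "2010-08-07"
puts date.year  # prints 2010
puts date.month # prints 8
puts date.day   # prints 7
```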

Working with directories

As an example, let’s assume we have a directory called “documents” that contains a number of document files.


In order to work with a whole directory of files, we have to use the directory method:

require 'document_mapper'
class MyDocument
  include DocumentMapper::Document
  self.directory = 'documents'
end

Now we can retrieve all available documents or filter them like this:

MyDocument.where(:title => 'Some title').first
MyDocument.where(:status => 'published').all
MyDocument.where(:year => 2010).all

Not all of the documents in the directory need to have all of the attributes. You can add single attributes to single documents, and the queries will only return those documents where the attributes match.

Document queries support more operators than just equality. The following operators are available:

MyDocument.where(:year.gt => 2010)        # year > 2010
MyDocument.where(:year.gte => 2010)       # year >= 2010
MyDocument.where(:year.in => [2010,2011]) # year one of [2010,2011]
MyDocument.where(:tags.include => 'ruby') # 'ruby' is included in tags = ['ruby', 'rails', ...]
MyDocument.where(:year.lt => 2010)        # year < 2010
MyDocument.where(:year.lte => 2010)       # year <= 2010
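
The semantics described in the comments above can be sketched over plain hashes. The operator table and the `filter` helper below are assumptions made for illustration; they do not reproduce the library's internals or its `:attribute.operator` syntax.

```ruby
# Hypothetical operator table (illustration only): each operator compares
# a document's value with the value given in the query.
OPERATORS = {
  equal:   ->(actual, expected) { actual == expected },
  gt:      ->(actual, expected) { actual > expected },
  gte:     ->(actual, expected) { actual >= expected },
  in:      ->(actual, expected) { expected.include?(actual) },
  include: ->(actual, expected) { actual.include?(expected) },
  lt:      ->(actual, expected) { actual < expected },
  lte:     ->(actual, expected) { actual <= expected }
}.freeze

# Keeps the documents whose attribute satisfies the operator.
# Documents that lack the attribute never match.
def filter(documents, attribute, operator, value)
  documents.select do |doc|
    doc.key?(attribute) && OPERATORS.fetch(operator).call(doc[attribute], value)
  end
end

docs = [
  { title: 'Old post', year: 2009, tags: ['ruby'] },
  { title: 'New post', year: 2011, tags: ['rails'] },
  { title: 'Undated post' }
]

puts filter(docs, :year, :gt, 2010).map { |d| d[:title] }.inspect         # prints ["New post"]
puts filter(docs, :year, :in, [2009, 2010]).map { |d| d[:title] }.inspect # prints ["Old post"]
puts filter(docs, :tags, :include, 'ruby').map { |d| d[:title] }.inspect  # prints ["Old post"]
```

The undated document is skipped by every year query, which mirrors the rule that only documents with matching attributes are returned.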

When retrieving documents, you can also define how they should be ordered. By default, the documents are returned in the order in which they were loaded from the file system, which usually means by file name, ascending. If you define an ordering, documents that lack the ordering attribute are excluded.

MyDocument.order_by(:title => :asc).all  # Order by title attribute, ascending
MyDocument.order_by(:title).all          # Same as order_by(:title => :asc)
MyDocument.order_by(:title => :desc).all # Order by title attribute, descending
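
The exclusion rule is easy to miss, so here is a small sketch of the described behaviour over plain hashes. The `order_by` helper below is hypothetical and takes its arguments differently from the library's method of the same name.

```ruby
# Hypothetical helper (illustration only): orders documents by one
# attribute and drops the documents that do not have that attribute.
def order_by(documents, attribute, direction = :asc)
  sorted = documents.select { |doc| doc.key?(attribute) }
                    .sort_by { |doc| doc[attribute] }
  direction == :desc ? sorted.reverse : sorted
end

docs = [
  { title: 'Banana', year: 2010 },
  { year: 2011 },
  { title: 'Apple' }
]

puts order_by(docs, :title).map { |d| d[:title] }.inspect        # prints ["Apple", "Banana"]
puts order_by(docs, :title, :desc).map { |d| d[:title] }.inspect # prints ["Banana", "Apple"]
```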


Chaining works with all available query methods, e.g.:

MyDocument.where(:status => 'published').where(:title => 'Some title').limit(2).all
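
One common way to make such chains work is to have every query method return a new query object and to evaluate the conditions only when results are requested. The class below is a sketch of that pattern with a hypothetical name (`SketchQuery`) and equality conditions only; it is not Document Mapper's own query class.

```ruby
# Hypothetical query object (illustration only).
class SketchQuery
  def initialize(documents, conditions = {}, max = nil)
    @documents = documents
    @conditions = conditions
    @max = max
  end

  # Each query method returns a new object, so calls can be chained.
  def where(conditions)
    SketchQuery.new(@documents, @conditions.merge(conditions), @max)
  end

  def limit(max)
    SketchQuery.new(@documents, @conditions, max)
  end

  # Nothing is filtered until the results are requested.
  def all
    matches = @documents.select do |doc|
      @conditions.all? { |key, value| doc[key] == value }
    end
    @max ? matches.first(@max) : matches
  end

  def first
    all.first
  end
end

docs = [
  { title: 'Some title', status: 'published' },
  { title: 'Some title', status: 'draft' },
  { title: 'Other title', status: 'published' }
]

result = SketchQuery.new(docs)
                    .where(status: 'published')
                    .where(title: 'Some title')
                    .limit(2)
                    .all
puts result.length # prints 1
```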


If any of the files change, you must manually reload them:

MyDocument.reload

Written by Ralph von der Heyden. Don’t hesitate to contact me if you have any further questions.

Follow me on Twitter!