A library for composing complex SQL queries by defining their subcomponents and the dependencies between them.
Ruby
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.

README.md

Query::Composer

Simple SQL queries are, well, simple. But when you start needing to deal with nested subqueries, and especially when those nested subqueries themselves require nested subqueries...things start getting difficult to manage.

Query::Composer was extracted from a real application, where reporting queries were dynamically generated and typically exceeded 50KB of text for the query alone!

This library allows you to specify each component of query independently, as well as allowing you to indicate which other components each component depends on. The composer will then build the correct query from those components, on demand.

Features

  • Define your queries in terms of components, each of which is more easily tested and debugged
  • A dependency-resolution system for determining the proper ordering of query subcomponents within a complex query
  • A simple class (Query::Base) for more conveniently defining queries using Arel
  • The ability to generate the same query using either derived tables (nested subqueries), or CTEs (Common Table Expressions)

Usage

First, instantiate a composer object:

require 'query/composer'

composer = Query::Composer.new

Then, declare the components of your query with the #use method:

composer.use(:patrons) { Patron.all }

Declare dependent components by providing parameters to the block that are named the same as the components that should be depended on:

# `patrons` must exist as another component in the composer...
composer.use(:books) { |patrons| ... }

Component definitions must return an object that responds to either #arel, or #to_sql:

# ActiveRecord scopes respond to #arel
composer.use(:patrons) { Patron.all }

require 'query/base'

# Arel objects and Query::Base (a thin wrapper around
# Arel::SelectManager) respond to #to_sql
composer.use(:books_by_patron) do |patrons|
  books = Book.arel_table
  lendings = Lending.arel_table

  Query::Base.new(books).
    project(patrons[:first_name], books[:name]).
    join(lendings).
      on(lendings[:book_id].eq(books[:id])).
    join(patrons).
      on(patrons[:id].eq(lendings[:patron_id]))
end

Generate the query by calling #build on the composer, and telling it which component will be the root of the query:

# Builds the query using the books_by_patron component as the root.
query = composer.build(:books_by_patron)
# SELECT "patrons"."first_name", "books"."name"
# FROM "books"
# INNER JOIN "lendings"
# ON "lendings"."book_id" = "books"."id"
# INNER JOIN (
#   SELECT "patrons".* FROM "patrons"
# ) "patrons"
# ON "patrons"."id" = "lendings"."patron_id"

# Builds the query using the patrons component as the root
query = composer.build(:patrons)
# SELECT "patrons".* FROM "patrons"

Run the query by converting it to SQL and executing it:

sql = query.to_sql

# using raw ActiveRecord connection
rows = ActiveRecord::Base.connection.execute(sql)

# using ActiveRecord models
rows = Book.find_by_sql(sql)

Example

Let's use a library system as an example. (See this full example in examples/library.rb.) We'll imagine that there is some administrative interface where users can generate reports. One report in particular is used to show:

  • All patrons from a specified set of libraries,
  • Who have checked out books this month,
  • From a specified set of topics,
  • And compare that with the same period of the previous month.

We will assume that we have a data model consisting of libraries, topics, books, patrons, and lendings, where books belong to libraries and topics, and lendings relate patrons to books, and include the date the lending was created.

First, instantiate a composer object:

require 'query/composer'
require 'query/base'

composer = Query::Composer.new

We'll assume we have some object that describes the parameters for the query, as given by the user:

today = Date.today

config.current_period_from = today.beginning_of_month
config.current_period_to   = today
config.prior_period_from   = today.last_month.beginning_of_month
config.prior_period_to     = today.last_month

config.library_ids         = [ ... ]
config.topic_ids           = [ ... ]

Then, we tell the composer about the components of our query:

# The set of libraries specified by the user
composer.use(:libraries_set) { Library.where(id: config.library_ids) }

# The set of topics specified by the user
composer.use(:topics_set) { Topic.where(id: config.topic_ids) }

# The set of patrons to consider (all of them, here)
composer.use(:patrons_set) { Patron.all }

# The set of books to consider (all those from the given libraries
# with the given topics)
composer.use(:books_set) do |libraries_set, topics_set|
  books = Book.arel_table

  Query::Base.new(books).
    project(books[:id]).
    join(libraries_set).
      on(books[:library_id].eq(libraries_set[:id])).
    join(topics_set).
      on(books[:topic_id].eq(topics_set[:id]))
end

Note the use of the parameters in the block for books_set. The names for the parameters are explicitly chosen here to match the names of other query components. Query::Composer uses these names to determine which components a component depends on--in this case, books_set depends on both libraries_set and topics_set.

We still need to tell the composer how to find the lendings. Because we'll need the same query with two different date spans (one for the "current" period, and one for the "prior" period), we'll create a helper method:

# books_set -- the set of books to be considered
# from_date -- the beginning of the period to consider
# to_date -- the end of the period to consider
def lendings_set(books_set, from_date, to_date)
  lendings = Lending.arel_table

  patron_id = lendings[:patron_id]
  count = patron_id.count.as("total")

  Query::Base.new(lendings).
    project(patron_id, count).
    join(books_set).
      on(lendings[:book_id].eq(books_set[:id])).
    where(lendings[:created_at].between(from_date..to_date)).
    group(patron_id)
end

This lendings set will be all patron ids who borrowed any of the books in the given set, between the given dates, and will include how many books were borrowed by each patron during that period.

With that, we can now finish defining our query components:

# Books in the "current" set
composer.use(:current_set) do |books_set|
  lendings_set(books_set,
    config.current_period_from,
    config.current_period_to)
end

composer.use(:prior_set) do |books_set|
  lendings_set(books_set,
    config.prior_period_from,
    config.prior_period_to)
end

# Joins the current_set and prior_set to the patrons_set
composer.use(:combined_set) do |patrons_set, current_set, prior_set|
  Query::Base.new(patrons_set).
    project(patrons_set[Arel.star],
            current_set[:total].as("current_total"),
            prior_set[:total].as("prior_total")).
    join(current_set).
      on(current_set[:patron_id].eq(patrons_set[:id])).
    join(prior_set, Arel::Nodes::OuterJoin).
      on(prior_set[:patron_id].eq(patrons_set[:id]))
end

There--our query is defined. Now we just need to tell the composer to generate the SQL. Once we have the SQL, we can use it to query the database:

sql = composer.build(:combined_set).to_sql

Patron.find_by_sql(sql).each do |patron|
  puts "#{patron.name} :: #{patron.current_total} :: #{patron.prior_total}"
end

The generated query, assuming a current month of Feb 2016, might look like this (formatted for readability):

SELECT a.*,
       e."total" AS current_total,
       f."total" AS prior_total
FROM (
  SELECT "patrons".*
  FROM "patrons"
) a
INNER JOIN (
  SELECT "lendings"."patron_id",
         COUNT("lendings"."patron_id") AS total
  FROM "lendings"
  INNER JOIN (
    SELECT "books"."id"
    FROM "books"
    INNER JOIN (
      SELECT "libraries".*
      FROM "libraries"
      WHERE "libraries"."id" IN (1, 2)
    ) b
    ON "books"."library_id" = b."id"
    INNER JOIN (
      SELECT "topics".*
      FROM "topics"
      WHERE "topics"."id" IN (1, 2, 3, 4)
    ) c
    ON "books"."topic_id" = c."id"
  ) d
  ON "lendings"."book_id" = d."id"
  WHERE "lendings"."created_at" BETWEEN '2016-02-01' AND '2016-02-15'
  GROUP BY "lendings"."patron_id"
) e
ON e."patron_id" = a."id"
LEFT OUTER JOIN (
  SELECT "lendings"."patron_id",
         COUNT("lendings"."patron_id") AS total
  FROM "lendings"
  INNER JOIN (
    SELECT "books"."id"
    FROM "books"
    INNER JOIN (
      SELECT "libraries".*
      FROM "libraries"
      WHERE "libraries"."id" IN (1, 2)
    ) b
    ON "books"."library_id" = b."id"
    INNER JOIN (
      SELECT "topics".*
      FROM "topics"
      WHERE "topics"."id" IN (1, 2, 3, 4)
    ) c
    ON "books"."topic_id" = c."id"
  ) d
  ON "lendings"."book_id" = d."id"
  WHERE "lendings"."created_at" BETWEEN '2016-01-01' AND '2016-01-15'
  GROUP BY "lendings"."patron_id"
) f
ON f."patron_id" = a."id"

For databases that support Common Table Expressions (CTE, or "with" queries), you can pass use_cte: true to the composer#build method to have the composer generate a CTE query instead. (NOTE that CTE queries can be very inefficient in some DBMS's, like PostgreSQL!)

sql = composer.build(:combined_set, use_cte: true)

The CTE query looks like this:

WITH
  "a" AS (
    SELECT "patrons".* FROM "patrons"),
  "b" AS (
    SELECT "libraries".*
    FROM "libraries"
    WHERE "libraries"."id" IN (1, 2)),
  "c" AS (
    SELECT "topics".*
    FROM "topics"
    WHERE "topics"."id" IN (1, 2, 3, 4)),
  "d" AS (
    SELECT "books"."id"
    FROM "books"
    INNER JOIN "b"
      ON "books"."library_id" = "b"."id"
    INNER JOIN "c"
      ON "books"."topic_id" = "c"."id"),
  "e" AS (
    SELECT "lendings"."patron_id",
           COUNT("lendings"."patron_id") AS total
    FROM "lendings"
    INNER JOIN "d"
      ON "lendings"."book_id" = "d"."id"
    WHERE "lendings"."created_at" BETWEEN '2016-02-01' AND '2016-02-15'
    GROUP BY "lendings"."patron_id"),
  "f" AS (
    SELECT "lendings"."patron_id",
           COUNT("lendings"."patron_id") AS total
    FROM "lendings"
    INNER JOIN "d" ON "lendings"."book_id" = "d"."id"
    WHERE "lendings"."created_at" BETWEEN '2016-01-01' AND '2016-01-15'
    GROUP BY "lendings"."patron_id")
SELECT "a".*,
       "e"."total" AS current_total,
       "f"."total" AS prior_total
FROM "a"
INNER JOIN "e"
ON "e"."patron_id" = "a"."id"
LEFT OUTER JOIN "f"
ON "f"."patron_id" = "a"."id"

Also, to make it easier to debug queries, you can also pass use_aliases: false to composer#build in order to make the composer use the full component names, instead of shorter aliases.

sql = composer.build(:combined_set, use_aliases: false)

The resulting query:

SELECT patrons_set.*,
       current_set."total" AS current_total,
       prior_set."total" AS prior_total
FROM (
  SELECT "patrons".*
  FROM "patrons"
) patrons_set
INNER JOIN (
  SELECT "lendings"."patron_id",
         COUNT("lendings"."patron_id") AS total
  FROM "lendings"
  INNER JOIN (
    SELECT "books"."id"
    FROM "books"
    INNER JOIN (
      SELECT "libraries".*
      FROM "libraries"
      WHERE "libraries"."id" IN (1, 2)
    ) libraries_set
    ON "books"."library_id" = libraries_set."id"
    INNER JOIN (
      SELECT "topics".*
      FROM "topics"
      WHERE "topics"."id" IN (1, 2, 3, 4)
    ) topics_set
    ON "books"."topic_id" = topics_set."id"
  ) books_set
  ON "lendings"."book_id" = books_set."id"
  WHERE "lendings"."created_at" BETWEEN '2016-02-01' AND '2016-02-15'
  GROUP BY "lendings"."patron_id"
) current_set
ON current_set."patron_id" = patrons_set."id"
LEFT OUTER JOIN (
  SELECT "lendings"."patron_id",
         COUNT("lendings"."patron_id") AS total
  FROM "lendings"
  INNER JOIN (
    SELECT "books"."id"
    FROM "books"
    INNER JOIN (
      SELECT "libraries".*
      FROM "libraries"
      WHERE "libraries"."id" IN (1, 2)
    ) libraries_set
    ON "books"."library_id" = libraries_set."id"
    INNER JOIN (
      SELECT "topics".*
      FROM "topics"
      WHERE "topics"."id" IN (1, 2, 3, 4)
    ) topics_set
    ON "books"."topic_id" = topics_set."id"
  ) books_set
  ON "lendings"."book_id" = books_set."id"
  WHERE "lendings"."created_at" BETWEEN '2016-01-01' AND '2016-01-15'
  GROUP BY "lendings"."patron_id"
) prior_set
ON prior_set."patron_id" = patrons_set."id"

License

Query::Composer is distributed under the MIT license. (See the LICENSE file for more information.)

Author

Query::Composer is written and maintained by Jamis Buck jamis@jamisbuck.org. Many thanks to T2 Modus for permitting this code to be released as open source!