ANSI SQL2003 window functions #103

Merged
merged 3 commits into from Feb 23, 2012

3 participants

@atombender

It was necessary to modify Arel to support window functions.

Example of a simple window function

table = Arel::Table.new('users')
table.project(['date', table[:id].count.over])

Generates:

SELECT "users"."date", COUNT("users"."id") OVER ()
FROM "users"

Window function with clauses

table = Arel::Table.new('users')
window = Arel::Nodes::Window.new.
  order(t['date'].asc).
  rows(Arel::Nodes::Preceding.new)
table.project(['date', table[:id].count.over(window)])

Generates:

SELECT "users"."date", COUNT("users"."id") OVER (
  ORDER BY "users"."date" ASC
  ROWS UNBOUNDED PRECEDING
)
FROM "users"

Declaring named windows

table = Arel::Table.new('users')
expr = table.project(['date', table[:id].count.over('mywindow')])
expr.window('mywindow').
  order(t['date'].asc).
  range(Arel::Nodes::Preceding.new(10))

Generates:

SELECT "users"."date", COUNT("users"."id") OVER "mywindow"
FROM "users"
WINDOW "mywindow" AS (
  ORDER BY "users"."date" ASC
  RANGE 10 PRECEDING
)

Caveats

  • PostgreSQL is the only database that I know (aside from Oracle) that implements window functions. So I have focused on syntax that is compatible with Postgres.

  • I have not implemented PARTITION BY yet, but I will be working on that.

  • I have tried to be careful about fitting this into Arel's node structure (with regard to things like arity), but the AST is somewhat inconsistently defined, and it's possible I have made a bunch of mistakes. For example, I have defined ROWS as a unary operator (as it takes an expression: ROWS UNBOUNDED PRECEDING, say), but it's debatable whether this is correctly. Similarly, the framing clause <n> PRECEDING is also implemented as unary, even though the value comes first. Also, even though OVER is a unary operator, OVER () is a valid clause, which is implemented by giving it nil as its right-hand side.

  • The AST-building verbs are window on SelectStatement, and over on functions. To write proper window definitions, one will have to build the AST manually. I would like to add some factory methods for building the various clauses.

@tenderlove
Ruby on Rails member

This seems good. If you'd like to clean up any AST inconsistencies you find, I'd really appreciate it. I was following the productions from the sqlite3 grammar, but I think it would be better long term if we switch to SQL99.

I'll merge this in. Thanks for the work!

@atombender

Cool!

As for inconsistencies: Well, I had a whole bunch of things in mind when I was hacking the code last night, but now I can't recall all of them. Some that come to mind at the moment:

  • Aliases are baked into Function. Aliases deserve to be conceptually an AST node, ie. new Alias(left, "some_name").

  • Related to that, the factory methods in expressions.rb actually specify hard-coded aliases.

  • Functions are not chainable. That may be a bug. You can't do table[:foo].count.sum to get SUM(COUNT(foo)).

  • Similarly, it should be possible to treat subqueries as expressions, so you can do things like ... WHERE COUNT(SELECT ... ) > 0. Could be that this is supported, though, I only gave the code a casual grep and could't see anything.

A more significant problem for me is that Arel has a kind of schizophrenia about AST vs. actual relational logic. For example, Table is not an AST node, and project produces a SelectManager, which isn't a node, either. SelectManager acts as a sort of builder. It would be natural to use a projection as a subquery, but it's not a node, so that's not supported (although I bet there's a way of extracting a select manager's expression).

I understand the logic behind this design, but it makes for a somewhat weird structure. I wish that it was possible to have a "pure" AST, with higher-level wrapper classes -- cleanly separated from the AST -- that implemented the actual relational operators.

@tenderlove tenderlove merged commit b757de1 into rails:master Feb 23, 2012
@tenderlove
Ruby on Rails member

Ya, most of the API weirdness is due to maintaining backwards compat with ARel 1.0 (since I had to do that 😢). I totally agree about your view of a pure AST with higher level classes. I'd love to get to that point. If you have time to make those kinds of changes, please do it. We (or I) can adjust rails internals accordingly.

If we had the pure AST, I think we could get to the point of adding a SQL parser along with a SQL AST => Ruby visitor. Then we could finally have the Ruby => SQL => Ruby circle of life! cue lion king soundtrack :-D

@srpouyet

Hey @alexstaubo did you manage to implement PARTITION BY? I searched the arel source but was unable to find any reference to it.

Cheers!

@atombender

@serepo, just added a pull request for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment