Skip to content
Newer
Older
100644 185 lines (114 sloc) 8.46 KB
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
1 ## Abstract ##
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 23, 2008
2
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
3 Arel is a Relational Algebra for Ruby. It 1) simplifies the generation complex of SQL queries and it 2) adapts to various RDBMS systems. It is intended to be a framework framework; that is, you can build your own ORM with it, focusing on innovative object and collection modeling as opposed to database compatibility and query generation.
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
4
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
5 ## Status ##
6
7 Arel is alpha software, BEWARE. Nevertheless, at this point, many (most?) SELECT queries can be composed, including very very complicated ones. Writes are only experimental for now.
8
9 For the moment, Arel uses ActiveRecord's connection adapters to connect to the various engines, connection pooling, perform quoting, and do type conversion. On the horizon is the use of DataObjects instead.
10
11 The long term goal, following both LINQ and DataMapper, is to have Arel adapt to engines beyond RDBMS, including XML, IMAP, YAML, etc.
12
13 ## A Gentle Introduction ##
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
14
15 Generating a query with ARel is simple. For example, in order to produce
16
17 SELECT * FROM users
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
18
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
19 you construct a table relation and convert it to sql:
20
7032a50 @brynary reorganized file structures
brynary authored May 17, 2009
21 users = Table(:users)
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
22 users.to_sql
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
23
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
24 In fact, you will probably never call `#to_sql`. Rather, you'll work with data from the table directly. You can iterate through all rows in the `users` table like this:
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
25
26 users.each { |user| ... }
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
27
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
28 In other words, Arel relations implement Ruby's Enumerable interface. Let's have a look at a concrete example:
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
29
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
30 users.first # => { users[:id] => 1, users[:name] => 'bob' }
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
31
ecd072d renamed attribute to operand per josh's suggestion
Nick Kallen authored Feb 24, 2008
32 As you can see, Arel converts the rows from the database into a hash, the values of which are sublimated to the appropriate Ruby primitive (integers, strings, and so forth).
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
33
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
34 ### More Sophisticated <strike>Queries</strike> Relations ###
35
36 Here is a whirlwind tour through the most common relational operators. These will probably cover 80% of all interaction with the database.
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
37
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
38 First is the 'restriction' operator, `where`:
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
39
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
40 users.where(users[:name].eq('amy'))
41 # => SELECT * FROM users WHERE users.name = 'amy'
42
43 What would, in SQL, be part of the `SELECT` clause is called in Arel a `projection`:
44
45 users.project(users[:id]) # => SELECT users.id FROM users
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
46
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
47 Joins resemble SQL strongly:
48
49 users.join(photos).on(users[:id].eq(photos[:user_id]))
50 # => SELECT * FROM users INNER JOIN photos ON users.id = photos.user_id
51
52 What are called `LIMIT` and `OFFSET` in SQL are called `take` and `skip` in Arel:
53
54 users.take(5) # => SELECT * FROM users LIMIT 5
55 users.skip(4) # => SELECT * FROM users OFFSET 4
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
56
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
57 `GROUP BY` is called `group`:
58
59 users.group(users[:name]) # => SELECT * FROM users GROUP BY name
60
61 The best property of the Relational Algebra is its "composability", or closure under all operations. For example, to select AND project, just "chain" the method invocations:
62
63 users \
64 .where(users[:name].eq('amy')) \
65 .project(users[:id]) \
66 # => SELECT users.id FROM users WHERE users.name = 'amy'
67
68 All operators are chainable in this way, and they are chainable any number of times, in any order.
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
69
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
70 users.where(users[:name].eq('bob')).where(users[:age].lt(25))
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
71
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
72 Of course, many of the operators take multiple arguments, so the last example can be written more tersely:
73
74 users.where(users[:name].eq('bob'), users[:age].lt(25))
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
75
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
76 The `OR` operator is not yet supported. It will work like this:
77
78 users.where(users[:name].eq('bob').or(users[:age].lt(25)))
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
79
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
80 The `AND` operator will behave similarly.
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
81
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
82 Finally, most operations take a block form. For example:
83
7032a50 @brynary reorganized file structures
brynary authored May 17, 2009
84 Table(:users) \
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
85 .where { |u| u[:id].eq(1) } \
86 .project { |u| u[:id] }
87
88 This provides a (sometimes) convenient alternative syntax.
89
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
90 ### The Crazy Features ###
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
91
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
92 The examples above are fairly simple and other libraries match or come close to matching the expressiveness of Arel (e.g., `Sequel` in Ruby).
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
93
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
94 #### Complex Joins ####
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
95
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
96 Where Arel really shines in its ability to handle complex joins and aggregations. As a first example, let's consider an "adjacency list", a tree represented in a table. Suppose we have a table `comments`, representing a threaded discussion:
97
7032a50 @brynary reorganized file structures
brynary authored May 17, 2009
98 comments = Table(:comments)
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
99
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
100 And this table has the following attributes:
ecd072d renamed attribute to operand per josh's suggestion
Nick Kallen authored Feb 25, 2008
101
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
102 comments.attributes # => [comments[:id], comments[:body], comments[:parent_id]]
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
103
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
104 The `parent_id` column is a foreign key from the `comments` table to itself. Now, joining a table to itself requires aliasing in SQL. In fact, you may alias in Arel as well:
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
105
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
106 replies = comments.alias
107 comments_with_replies = \
108 comments.join(replies).on(replies[:parent_id].eq(comments[:id]))
109 # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
110
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
111 The call to `#alias` is actually optional: Arel will always produce a unique name for every table joined in the relation, and it will always do so deterministically to exploit query caching. Explicit aliasing is more common, however. When you want to extract specific slices of data, aliased tables are a necessity. For example to get just certain columns from the row, treat a row like a hash:
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
112
113 comments_with_replies.first[replies[:body]]
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
114
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
115 This will return the first comment's reply's body.
116
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
117 If you don't need to extract the data later (for example, you're simply doing a join to find comments that have replies, you don't care what the content of the replies are), the block form may be preferable:
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
118
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
119 comments.join(comments) { |comments, replies| replies[:parent_id].eq(comments[:id]) }
120 # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
121
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
122 Note that you do NOT want to do something like:
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
123
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
124 comments.join(comments, comments[:parent_id].eq(comments[:id]))
125 # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments.parent_id = comments.id
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
126
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
127 This does NOT have the same meaning as the previous query, since the comments[:parent_id] reference is effectively ambiguous.
2654c29 test coverage of #prefix_for on join.
Nick Kallen authored Mar 12, 2008
128
3eae3b0 renamed select operation to where
Nick Kallen authored May 19, 2008
129 #### Complex Aggregations ####
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
130
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
131 My personal favorite feature of Arel, certainly the most difficult to implement, and possibly only of marginal value, is **closure under joining even in the presence of aggregations**. This is a feature where the Relational Algebra is fundamentally easier to use than SQL. Think of this as a preview of the kind of radical functionality that is to come, stuff no other "ORM" is doing.
132
133 The easiest way to introduce this is in SQL. Your task is to get all users and the **count** of their associated photos. Let's start from the inside out:
d3c7c37 added primitive update functionality
Nick Kallen authored Feb 24, 2008
134
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
135 SELECT count(*)
136 FROM photos
137 GROUP BY user_id
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
138
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
139 Now, we'd like to join this with the user table. Naively, you might try to do this:
140
141 SELECT users.*, count(photos.id)
142 FROM users
143 LEFT OUTER JOIN photos
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
144 ON users.id = photos.user_id
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
145 GROUP BY photos.user_id
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
146
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
147 Of course, this has a slightly different meaning than our intended query. This is actually a fairly advanced topic in SQL so let's see why this doesn't work *step by step*. Suppose we have these records in our `users` table:
148
149 mysql> select * from users;
150 +------+--------+
151 | id | name |
152 +------+--------+
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
153 | 1 | hai |
154 | 2 | bai |
155 | 3 | dumpty |
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
156 +------+--------+
157
158 And these in the photos table:
159
160 mysql> select * from photos;
161 +------+---------+-----------+
162 | id | user_id | camera_id |
163 +------+---------+-----------+
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
164 | 1 | 1 | 1 |
165 | 2 | 1 | 1 |
166 | 3 | 1 | 1 |
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
167 +------+---------+-----------+
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
168
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
169 If we perform the above, incorrect query, we get the following:
170
171 mysql> select users.*, count(photos.id) from users left outer join photos on users.id = photos.user_id limit 3 group by user_id;
172 +------+------+------------------+
173 | id | name | count(photos.id) |
174 +------+------+------------------+
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
175 | 2 | bai | 0 |
176 | 1 | hai | 3 |
41f80e4 limits and offsets need to be externalized too. first draft
Nick Kallen authored May 20, 2008
177 +------+------+------------------+
178
179 As you can see, we're completely missing data for user with id 3. `dumpty` has no photos, neither does `bai`. But strangely `bai` appeared and `dumpty` didn't! The reason is that the `GROUP BY` clause is aggregating on both tables, not just the `photos` table. All users without photos have a `photos.id` of `null` (thanks to the left outer join). These are rolled up together and an arbitrary user wins. In this case, `bai` not `dumpty`.
180
181 SELECT users.*, photos_aggregation.cnt
182 FROM users
183 LEFT OUTER JOIN (SELECT user_id, count(*) as cnt FROM photos GROUP BY user_id) AS photos_aggregation
755a7ce @miloops REAME updated: remove whitespaces, fix invalid join in query output.
miloops authored Aug 23, 2009
184 ON photos_aggregation.user_id = users.id
Something went wrong with that request. Please try again.