Permalink
Browse files

REAME updated: remove whitespaces, fix invalid join in query output.

  • Loading branch information...
1 parent 3d747a5 commit 755a7ced2f98b0bb246089c80cdfa04cd918fa89 @miloops miloops committed Aug 23, 2009
Showing with 26 additions and 26 deletions.
  1. +26 −26 README.markdown
View
@@ -15,20 +15,20 @@ The long term goal, following both LINQ and DataMapper, is to have Arel adapt to
Generating a query with ARel is simple. For example, in order to produce
SELECT * FROM users
-
+
you construct a table relation and convert it to sql:
users = Table(:users)
users.to_sql
-
+
In fact, you will probably never call `#to_sql`. Rather, you'll work with data from the table directly. You can iterate through all rows in the `users` table like this:
users.each { |user| ... }
-
+
In other words, Arel relations implement Ruby's Enumerable interface. Let's have a look at a concrete example:
users.first # => { users[:id] => 1, users[:name] => 'bob' }
-
+
As you can see, Arel converts the rows from the database into a hash, the values of which are sublimated to the appropriate Ruby primitive (integers, strings, and so forth).
### More Sophisticated <strike>Queries</strike> Relations ###
@@ -43,7 +43,7 @@ First is the 'restriction' operator, `where`:
What would, in SQL, be part of the `SELECT` clause is called in Arel a `projection`:
users.project(users[:id]) # => SELECT users.id FROM users
-
+
Joins resemble SQL strongly:
users.join(photos).on(users[:id].eq(photos[:user_id]))
@@ -53,7 +53,7 @@ What are called `LIMIT` and `OFFSET` in SQL are called `take` and `skip` in Arel
users.take(5) # => SELECT * FROM users LIMIT 5
users.skip(4) # => SELECT * FROM users OFFSET 4
-
+
`GROUP BY` is called `group`:
users.group(users[:name]) # => SELECT * FROM users GROUP BY name
@@ -68,15 +68,15 @@ The best property of the Relational Algebra is its "composability", or closure u
All operators are chainable in this way, and they are chainable any number of times, in any order.
users.where(users[:name].eq('bob')).where(users[:age].lt(25))
-
+
Of course, many of the operators take multiple arguments, so the last example can be written more tersely:
users.where(users[:name].eq('bob'), users[:age].lt(25))
The `OR` operator is not yet supported. It will work like this:
users.where(users[:name].eq('bob').or(users[:age].lt(25)))
-
+
The `AND` operator will behave similarly.
Finally, most operations take a block form. For example:
@@ -96,7 +96,7 @@ The examples above are fairly simple and other libraries match or come close to
Where Arel really shines in its ability to handle complex joins and aggregations. As a first example, let's consider an "adjacency list", a tree represented in a table. Suppose we have a table `comments`, representing a threaded discussion:
comments = Table(:comments)
-
+
And this table has the following attributes:
comments.attributes # => [comments[:id], comments[:body], comments[:parent_id]]
@@ -107,23 +107,23 @@ The `parent_id` column is a foreign key from the `comments` table to itself. Now
comments_with_replies = \
comments.join(replies).on(replies[:parent_id].eq(comments[:id]))
# => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id
-
+
The call to `#alias` is actually optional: Arel will always produce a unique name for every table joined in the relation, and it will always do so deterministically to exploit query caching. Explicit aliasing is more common, however. When you want to extract specific slices of data, aliased tables are a necessity. For example to get just certain columns from the row, treat a row like a hash:
comments_with_replies.first[replies[:body]]
-
+
This will return the first comment's reply's body.
If you don't need to extract the data later (for example, you're simply doing a join to find comments that have replies, you don't care what the content of the replies are), the block form may be preferable:
comments.join(comments) { |comments, replies| replies[:parent_id].eq(comments[:id]) }
# => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id
-
+
Note that you do NOT want to do something like:
comments.join(comments, comments[:parent_id].eq(comments[:id]))
# => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments.parent_id = comments.id
-
+
This does NOT have the same meaning as the previous query, since the comments[:parent_id] reference is effectively ambiguous.
#### Complex Aggregations ####
@@ -135,24 +135,24 @@ The easiest way to introduce this is in SQL. Your task is to get all users and t
SELECT count(*)
FROM photos
GROUP BY user_id
-
+
Now, we'd like to join this with the user table. Naively, you might try to do this:
SELECT users.*, count(photos.id)
FROM users
LEFT OUTER JOIN photos
- ON users.id = photos.id
+ ON users.id = photos.user_id
GROUP BY photos.user_id
-
+
Of course, this has a slightly different meaning than our intended query. This is actually a fairly advanced topic in SQL so let's see why this doesn't work *step by step*. Suppose we have these records in our `users` table:
mysql> select * from users;
+------+--------+
| id | name |
+------+--------+
- | 1 | hai |
- | 2 | bai |
- | 3 | dumpty |
+ | 1 | hai |
+ | 2 | bai |
+ | 3 | dumpty |
+------+--------+
And these in the photos table:
@@ -161,24 +161,24 @@ And these in the photos table:
+------+---------+-----------+
| id | user_id | camera_id |
+------+---------+-----------+
- | 1 | 1 | 1 |
- | 2 | 1 | 1 |
- | 3 | 1 | 1 |
+ | 1 | 1 | 1 |
+ | 2 | 1 | 1 |
+ | 3 | 1 | 1 |
+------+---------+-----------+
-
+
If we perform the above, incorrect query, we get the following:
mysql> select users.*, count(photos.id) from users left outer join photos on users.id = photos.user_id limit 3 group by user_id;
+------+------+------------------+
| id | name | count(photos.id) |
+------+------+------------------+
- | 2 | bai | 0 |
- | 1 | hai | 3 |
+ | 2 | bai | 0 |
+ | 1 | hai | 3 |
+------+------+------------------+
As you can see, we're completely missing data for user with id 3. `dumpty` has no photos, neither does `bai`. But strangely `bai` appeared and `dumpty` didn't! The reason is that the `GROUP BY` clause is aggregating on both tables, not just the `photos` table. All users without photos have a `photos.id` of `null` (thanks to the left outer join). These are rolled up together and an arbitrary user wins. In this case, `bai` not `dumpty`.
SELECT users.*, photos_aggregation.cnt
FROM users
LEFT OUTER JOIN (SELECT user_id, count(*) as cnt FROM photos GROUP BY user_id) AS photos_aggregation
- ON photos_aggregation.user_id = users.id
+ ON photos_aggregation.user_id = users.id

0 comments on commit 755a7ce

Please sign in to comment.