Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for recursive CTEs in ActiveRecord #51601

Merged
merged 1 commit into from May 2, 2024

Conversation

ClearlyClaire
Copy link
Contributor

@ClearlyClaire ClearlyClaire commented Apr 18, 2024

Motivation / Background

While Rails 7.1 has added support for writing Common Table Expressions, this support does not extend to recursive CTEs.

This PR adds a QueryMethods#with_recursive construct to enable recursive CTEs.

In addition, it enables passing arrays of queries as with hash values, as recursive CTEs are generally only useful with UNION or UNION ALL, which, to my knowledge, are currently not exposed outside of Arel.

Detail

This Pull Request adds a WithChain placeholder object to chain with and recursive, in the manner of where.not. Calling QueryMethods#with_recursive returns a scope with a with_is_recursive flag turned on, which is then used in build_with to pass :recursive to Arel.

The Pull Request also changes build_with_value_from_hash to allow arrays of values as queries, in which case the subqueries are joined with UNION ALL.

This enables writing queries such as:

Post.with_recursive(
  post_and_replies: [
    Post.where(id: 42),
    Post.joins('JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id'),
  ]
)

Generating the following SQL:

WITH RECURSIVE "post_and_replies" AS (
  (SELECT "posts".* FROM "posts" WHERE "posts"."id" = 42)
  UNION ALL
  (SELECT "posts".* FROM "posts" JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id)
)
SELECT "posts".* FROM "posts"

Additional information

Several syntaxes have previously been discussed to support recursive CTEs, as well as unions.

I took a lot of inspiration from that last monkey-patch, but the API felt significantly at odds with the existing QueryMethods#with, so I decided to go with the .with_recursive approach and extend that interface to offer to UNION ALL sub-queries.

I have decided to put both the array support and the .with_recursive support in the same PR, as they are closely related, but I can split it into two PRs if preferred.

Tests are updated accordingly, but don't actually test inherently recursive queries, as I'm not sure there are existing fixtures lending themselves to that.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Unrelated changes should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

activerecord/test/cases/relation/with_test.rb Outdated Show resolved Hide resolved
Comment on lines 77 to 81
# TODO: actually test recursive behavior
relation = Post
.with.recursive(posts_with_comments: Post.where("legacy_comments_count > 0"))
.from("posts_with_comments AS posts")

assert_equal POSTS_WITH_COMMENTS, relation.order(:id).pluck(:id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the test fixtures and could not find ones that lend themselves to testing actual recursive CTEs.

activerecord/lib/active_record/relation/query_methods.rb Outdated Show resolved Hide resolved
@casperisfine
Copy link
Contributor

FYI: @vlado

Copy link
Contributor

@casperisfine casperisfine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly a single comment on API, but other than that, the implementation seem relatively straightforward.

Comment on lines 149 to 162
class WithChain
def initialize(scope) # :nodoc:
@scope = scope
end

# Returns a new relation in which Common Table Expressions (CTEs) are flagged as recursive.
#
# See QueryMethods#with for more details.
def recursive(*args)
@scope.with_values += args
@scope.with_is_recursive = true
@scope
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need this. I think .with_recursive is just as fine as .with.recursive.

Unless I'm missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking .with.recursive was more in line with existing query methods, but from a usage perspective, with_recursive would work as well and should make the implementation simpler. I will switch to that in another commit I can then squash if we decide this is the way to go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I assume you were thinking of where.not / where.associated / where.missing.

But in that case, I doubt we'll ever need multiple with_, but I see how as a enduser it might appear inconsistent :/

Copy link
Contributor

@vlado vlado Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think .with.recursive is more in line with existing query methods and would prefer that one.

Having said that, as @ClearlyClaire pointed out .with_recursive is simpler to implement (due to missing union support in ActiveRecord) and I prefer passing relations as positional arguments.

Post.with.recursive(
  post_and_replies: [
    Post.where(id: 42),
    Post.joins('JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id'),
  ]
)

# vs

Post.with_recursive(:post_and_replies, Post.where(id: 42), Post.joins('JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id')

Or maybe we should add union support before implementing this 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed a commit to change to QueryMethods#with_recursive, which works exactly the same way as QueryMethods#with except it also sets an instance variable to mark the WITH with RECURSIVE.

It's a separate commit so we can easily go back to with.recursive if that is deemed better.

Having said that, as @ClearlyClaire pointed out .with_recursive is simpler to implement (due to missing union support in ActiveRecord) and I prefer passing relations as positional arguments.

I actually prefer the array because it seems to me that it is more consistent with QueryMethods#with, and also allows passing a single SQL snippet that does the UNION, should the user prefer doing that. That being said, even with positional arguments we could make the last one optional to allow that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 with_recursive seems better here, since chaining would serve only one purpose.

@casperisfine
Copy link
Contributor

I added the missing test as well as a CHANGELOG entry. I'll merge on green.

```ruby
Post.with_recursive(
  post_and_replies: [
    Post.where(id: 42),
    Post.joins('JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id'),
  ]
)
```

Generates the following SQL:

```sql
WITH RECURSIVE "post_and_replies" AS (
  (SELECT "posts".* FROM "posts" WHERE "posts"."id" = 42)
  UNION ALL
  (SELECT "posts".* FROM "posts" JOIN post_and_replies ON posts.in_reply_to_id = post_and_replies.id)
)
SELECT "posts".* FROM "posts"
```
@byroot byroot merged commit 26e53f4 into rails:main May 2, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants