Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for annotating queries generated by ActiveRecord::Relation with SQL comments #35617

Merged

Conversation

@mattyoho
Copy link
Contributor

mattyoho commented Mar 15, 2019

Summary

This pull request implements a small new feature that allows users to annotate generated SQL queries.

A new method, ActiveRecord::Relation#annotate, has been added that allows annotating queries generated from the relation instance with block-style SQL comments.

For example:

Post.annotate("this is a comment").where(id: 123).to_sql
# SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123 /* this is a comment */

The hope is to land this feature in time for Rails 6.0. 馃檪馃

Discussion

This kind of annotation can be extremely useful when combined with instrumentation or other methods of tracking and analyzing queries generated at runtime, particularly in a large, mature application.

The marginalia gem provides similarly useful functionality in some contexts, but currently operates on all queries within a request or job and can't be used to target specific queries.

By hooking into ActiveRecord::Relation it's possible to annotate arbitrary scopes and model associations.

class Post < ActiveRecord::Base
  has_many :tags, -> { annotate("association comment") }

  scope :published, -> { where.not(published_at: nil).annotate("scope comment") }
end

This PR also updates ActiveRecord::Relation#update_all and ActiveRecord::Relation#delete_all to respect any annotations on the receiving relation object.

Multiple annotations may be added:

Post.where(id: 123).with_annotation("foo").annotate("bar")
# SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123 /* foo */ /* bar */

Not all queries issued from ActiveRecord can be annotated using this approach, but many can.

Implementation

The implementation extends Arel under the hood by adding a new Arel::Nodes::Comment node and associated plumbing to work with them.

Of note, Arel::Nodes::Comment is responsible for preventing SQL injection via comment escaping.

Comment content is scrubbed of SQL block comment delimiters at creation time here. This should be reviewed for correctness. 馃憖

Example use with instrumentation

Here's a toy example of how this could be used with ActiveSupport::Notifications instrumentation to perform basic analysis.

query_checker = Proc.new do |_, _, _, _, payload|
  query = payload[:sql]
  if query =~ /JOIN "forbidden_table"/ && query !~ %r{/\* whitelisted \*/}
    Rails.logger.warn("this is bad!")
  end
end
ActiveSupport::Notifications.subscribe "sql.active_record", query_logger

# This would be all right
Post.annotate("whitelisted").joins(:forbidden_table).first
# But this would raise eyebrows
Post.joins(:forbidden_table).first

Rails Guides

It might be useful to add documentation for this new method to the Active Record Query Interface page.

I can provide a separate PR for that or add it here if so.

@rails-bot rails-bot bot added the activerecord label Mar 15, 2019

@jeremy jeremy referenced this pull request Mar 15, 2019

Merged

Support Optimizer Hints #35615

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from f64d332 to c4bf5a7 Mar 15, 2019

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from c4bf5a7 to e049e00 Mar 15, 2019

@jeremy
Copy link
Member

jeremy left a comment

Welcome addition and great, thorough PR. Thank you!

Show resolved Hide resolved activerecord/lib/arel/nodes/select_core.rb
Show resolved Hide resolved activerecord/lib/arel/nodes/comment.rb Outdated
Show resolved Hide resolved activerecord/CHANGELOG.md
Show resolved Hide resolved activerecord/test/cases/arel/nodes/comment_test.rb Outdated
@mattyoho
Copy link
Contributor Author

mattyoho left a comment

Thanks for the code review, @jeremy!

I've switched to adding distinct commits while I'm iterating based on review feedback. I can squash back down toward the end.

The Support Optimizer Hints PR in #35615 is really interesting in relation to this one. Edit: Or vice-versa. 馃槃

From #35615 (comment):

Should we ensure the hint is safe for a SQL comment, preventing query injection?

Could reuse the Arel Comment node #35617

I do think there'd be value in some slight generalization here to allow #35615 to reuse some of this. I've left some comments inline discussing it more.

In particular in this PR since the last review:

  • 4a4003a renames Arel::Nodes::Comment to Arel::Nodes::Annotation.
  • acdd316 demonstrates Arel::Nodes::Annotation representing an optimization hint within the visitor.
Show resolved Hide resolved activerecord/CHANGELOG.md
Show resolved Hide resolved activerecord/lib/arel/nodes/select_core.rb
Show resolved Hide resolved activerecord/test/cases/arel/nodes/comment_test.rb Outdated
Show resolved Hide resolved activerecord/lib/arel/nodes/comment.rb Outdated

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 5 times, most recently from 874918b to 8235e9b Mar 16, 2019

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 17, 2019

Howdy. 馃憢

I've iterated based on some review feedback and self-review, added some additional tests, and incorporated the latest changes from master, including #35615.

I think this patch is now in a state where it could be merged and so is ready for review again. 馃檪馃

I think it's worth me pointing out that I've refactored the additions in #35615 a bit after incorporating them into this branch. Hopefully that seems reasonable after looking at the diff.

In particular, I've changed Arel::Nodes::OptimizerHints to inherit from Arel::Nodes::Annotation rather than Arel::Nodes::Unary.

I think this makes conceptual sense, but the biggest practical advantage comes from reusing the comment sanitization routine rather than having two adjacent implementations.

You can see the changes specific to this refactor isolated in this Gist.

Thank you!

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from 8235e9b to 6ea7580 Mar 17, 2019

Show resolved Hide resolved activerecord/test/cases/relation_test.rb Outdated
Show resolved Hide resolved activerecord/lib/arel/insert_manager.rb Outdated
Show resolved Hide resolved activerecord/lib/arel/nodes/annotation.rb Outdated

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 2 times, most recently from 9d2831e to a2d0fe2 Mar 18, 2019

@mattyoho
Copy link
Contributor Author

mattyoho left a comment

I've rebased off latest master and attempted to respond to the most recent feedback.

The #annotate query method now expects a String or Arel::Nodes::SqlLiteral argument.

Sanitization of SQL block comment delimiters is still happening at the node level, within the Arel::Nodes::Annotation mixin, for the reasons I proposed in #35617 (comment), but I'm of course willing to incorporate further feedback on that based on the discussion. 馃檪

Show resolved Hide resolved activerecord/lib/arel/insert_manager.rb Outdated

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 2 times, most recently from 16c7d2b to 0455475 Mar 18, 2019

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 18, 2019

Sanitization of SQL block comment delimiters is still happening at the node level, within the Arel::Nodes::Annotation mixin, for the reasons I proposed in #35617 (comment), but I'm of course willing to incorporate further feedback on that based on the discussion.

When #35660 is merged, I'll update this diff to reuse that approach. 馃憤

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 5 times, most recently from 8863462 to d5c7a85 Mar 19, 2019

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 19, 2019

The latest base branch changes have been incorporated and sanitization in the visitors has been embraced. I think this is ready for review again. /cc @jeremy

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 2 times, most recently from 4c3fb9e to 9c1adca Mar 20, 2019

@eileencodes
Copy link
Member

eileencodes left a comment

Looks good to me. @jeremy anymore concerns or do you think this is good to merge?

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from 9c1adca to 8aaf0eb Mar 20, 2019

@mattyoho
Copy link
Contributor Author

mattyoho left a comment

There is no worth to share Arel::Nodes::Annotation anymore.

Removed with latest push.

@kamipo kamipo self-assigned this Mar 20, 2019

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from 8aaf0eb to 3f87e66 Mar 20, 2019

Show resolved Hide resolved activerecord/lib/arel/table.rb Outdated

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from 3f87e66 to 8a1f7ba Mar 21, 2019

end

def visit_Arel_Nodes_OptimizerHints(o, collector)
hints = o.expr.map { |v| sanitize_as_sql_comment(v) }.join(" ")
collector << "/*+ #{hints} */"
end

def visit_Arel_Nodes_Comment(o, collector)
comments = o.values.map { |v| sanitize_as_sql_comment(v) }.join(" ")

This comment has been minimized.

Copy link
@kamipo

kamipo Mar 21, 2019

Member

How about joining multiple comments with "; "?

I think it is one solution to mitigate the issue Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment")) #=> "/* post comment tag comment */".

This comment has been minimized.

Copy link
@mattyoho

mattyoho Mar 21, 2019

Author Contributor

I don't think text content should be inserted into the user's annotation if it can be avoided. As a consumer of the API I wouldn't want that.

After more reflection I think the best approach by far is actually to wrap each annotation in its own delimiters, e.g.,

Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment"))
#=> "... /* post comment */ /* tag comment */"

This solves the concerns with running them together, it makes it much simpler to detect or parse out annotation in instrumentation, and it prevents different developers from stepping on one another's annotations.

I've been using a "backport" of this API in an experimental branch for a large refactor and after working with it a bit this is the behavior I'd want to have.

Show resolved Hide resolved activerecord/test/cases/associations_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/associations_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/associations_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/associations_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/relation/delete_all_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/relation/update_all_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/relation_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/scoping/relation_scoping_test.rb Outdated
Show resolved Hide resolved activerecord/test/cases/scoping/relation_scoping_test.rb Outdated

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch 4 times, most recently from cd8e57b to fe87acf Mar 21, 2019

Add Relation#annotate for SQL commenting
This patch has two main portions:

1. Add SQL comment support to Arel via Arel::Nodes::Comment.
2. Implement a Relation#annotate method on top of that.

== Adding SQL comment support

Adds a new Arel::Nodes::Comment node that represents an optional SQL
comment and teachers the relevant visitors how to handle it.

Comment nodes may be added to the basic CRUD statement nodes and set
through any of the four (Select|Insert|Update|Delete)Manager objects.

For example:

    manager = Arel::UpdateManager.new
    manager.table table
    manager.comment("annotation")
    manager.to_sql # UPDATE "users" /* annotation */

This new node type will be used by ActiveRecord::Relation to enable
query annotation via SQL comments.

== Implementing the Relation#annotate method

Implements `ActiveRecord::Relation#annotate`, which accepts a comment
string that will be appeneded to any queries generated by the relation.

Some examples:

    relation = Post.where(id: 123).annotate("metadata string")
    relation.first
    # SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123
    # LIMIT 1 /* metadata string */

    class Tag < ActiveRecord::Base
      scope :foo_annotated, -> { annotate("foo") }
    end
    Tag.foo_annotated.annotate("bar").first
    # SELECT "tags".* FROM "tags" LIMIT 1 /* foo */ /* bar */

Also wires up the plumbing so this works with `#update_all` and
`#delete_all` as well.

This feature is useful for instrumentation and general analysis of
queries generated at runtime.

@mattyoho mattyoho force-pushed the mattyoho:add-annotation-support-to-relations branch from fe87acf to f418258 Mar 22, 2019

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 22, 2019

@kamipo Thank you for the thorough code review. The patch is much better because of it.

I believe this PR is once again ready for review and hopefully merging.

@eileencodes eileencodes merged commit f408608 into rails:master Mar 22, 2019

3 checks passed

buildkite/rails Build #59757 passed (15 minutes, 9 seconds)
Details
codeclimate All good!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@arthurnn

This comment has been minimized.

Copy link
Member

arthurnn commented Mar 22, 2019

On marginalia we do some escaping on the comment, to make sure people will not annotate queries with user input and be open to SQL injections.
Is that necessary in here? (I scammed the code fast, and haven't saw any prevention like that)

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 22, 2019

@arthurnn That's being handled here:

collector << o.values.map { |v| "/* #{sanitize_as_sql_comment(v)} */" }.join(" ")

(Or here in this diff.)

@mattyoho

This comment has been minimized.

Copy link
Contributor Author

mattyoho commented Mar 22, 2019

(There are tests, e.g, here and here, as well.)

@@ -59,6 +59,7 @@ To retrieve objects from the database, Active Record provides several finder met

The methods are:

* `annotate`

This comment has been minimized.

Copy link
@abhaynikam

abhaynikam Mar 24, 2019

Contributor

Is annotate a finder method? Just curious to understand this because the documentation on the above line says

To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.

As per the changelog, it says Add 'ActiveRecord::Relation#annotate' for adding SQL comments to its queries.

This comment has been minimized.

Copy link
@mattyoho

mattyoho Mar 24, 2019

Author Contributor

Hi, @abhaynikam! It's possible the description should be generalized a bit. Several other relation query methods arguably don't fit that documention sentence literally, either 鈥 preload, create_with, and includes being examples.

Perhaps the intro sentence could be updated in another PR.

This comment has been minimized.

Copy link
@abhaynikam

abhaynikam Mar 24, 2019

Contributor

Thanks, @mattyoho . That makes sense, I agree a few other methods do not fit the description.

Great addition @mattyoho 馃帀

This comment has been minimized.

Copy link
@abhaynikam

abhaynikam Mar 24, 2019

Contributor

About updating the description, I would say let remove methods which do not fit description sinces that particular documentation section is for understanding how to retrieve objects from database.

I'll like to raise a PR if somebody from Rails team approve it is correct to update the doc 馃槉

kamipo added a commit to kamipo/rails that referenced this pull request Apr 1, 2019

Revert unused code and re-using query annotation for `update_all` and鈥
鈥 `delete_all`

This partly reverts rails#35617.

rails#35617 includes unused code (for `InsertStatement`) and re-using query
annotation for `update_all` and `delete_all`, which has not been
discussed yet.

If a relation has any annotation, I think it is mostly for SELECT query,
so re-using annotation by default is not always desired behavior for me.

We should discuss about desired behavior before publishing the
implementation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.