Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Optimizer Hints #35615

Merged
merged 1 commit into from Mar 16, 2019
Merged

Support Optimizer Hints #35615

merged 1 commit into from Mar 16, 2019

Conversation

kamipo
Copy link
Member

@kamipo kamipo commented Mar 14, 2019

We as Arm Treasure Data are using Optimizer Hints with a monkey patch
(https://gist.github.com/kamipo/4c8539f0ce4acf85075cf5a6b0d9712e),
especially in order to use MAX_EXECUTION_TIME (refer #31129).

Example:

class Job < ApplicationRecord
  default_scope { optimizer_hints("MAX_EXECUTION_TIME(50000) NO_INDEX_MERGE(jobs)") }
end

Optimizer Hints is supported not only for MySQL but also for most
databases (PostgreSQL on RDS, Oracle, SQL Server, etc), it is really
helpful to turn heavy queries for large scale applications.

@kaspth
Copy link
Contributor

kaspth commented Mar 14, 2019

Should the exposed API be closer to something like this?

class Job < ApplicationRecord
  query_optimize max_execution_time: 50.minutes, no_index_merge: true
end

Then we’d add an option for other hints as well.

Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, and welcome.

@jeremy
Copy link
Member

jeremy commented Mar 15, 2019

@kaspth In our usage, we provide optimizer hints for specific queries and scopes. A class-level wrapper declaration could work for default scoping, but the lower level relation support is desirable.

Copy link
Member

@rafaelfranca rafaelfranca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@mattyoho
Copy link
Contributor

mattyoho commented Mar 15, 2019

Hi, @kamipo! 👋

I don't want to derail this pull request, so my apologies if this feels derailing.

I thought it was amusing that this pull request and #35617, which aimed to add plain SQL comment support to ActiveRecord::Relation, were opened within an hour or two of one another given their similarity.

@jeremy's comment at #35615 (comment) caused me to think about what it would take to generalize #35617 a bit so that it could act as underpinning for the new #optimizer_hints query method.

Given how similar the concepts are, it seems as though there would be value in sharing an abstraction.

I've since made some changes over in #35617, including generalizing an Arel::Nodes::Annotation node and adding hooks to customize how a visitor might generate output from that node.

Here's a small synthetic patch demonstrating how I think it could work:

diff --git a/activerecord/lib/arel/select_manager.rb b/activerecord/lib/arel/select_manager.rb
index 8a8bd3b..91db035 100644
--- a/activerecord/lib/arel/select_manager.rb
+++ b/activerecord/lib/arel/select_manager.rb
@@ -148,7 +148,7 @@ module Arel # :nodoc: all

     def optimizer_hints(*hints)
       unless hints.empty?
-        @ctx.optimizer_hints = Arel::Nodes::OptimizerHints.new(hints)
+        @ctx.optimizer_hints = Arel::Nodes::Annotation.new(hints, hint: true)
       end
       self
     end
@@ -263,4 +263,4 @@ module Arel # :nodoc: all
         end
       end
   end
 end
diff --git a/activerecord/lib/arel/visitors/ibm_db.rb b/activerecord/lib/arel/visitors/ibm_db.rb
index 73166054da..05908b0fc0 100644
--- a/activerecord/lib/arel/visitors/ibm_db.rb
+++ b/activerecord/lib/arel/visitors/ibm_db.rb
@@ -16,6 +16,15 @@ def is_distinct_from(o, collector)
           collector = visit [o.left, o.right, 0, 1], collector
           collector << ")"
         end
+
+        HINT_START       = "/* <OPTGUIDELINES>"  # :nodoc:
+        HINT_END         = "</OPTGUIDELINES> */" # :nodoc:
+        HINT_SEPARATOR   = ""                    # :nodoc:
+
+        def visit_Arel_Nodes_SelectCore(o, collector)
+          collector = super
+          maybe_visit o.optimizer_hints, collector
+        end
     end
   end
 end
diff --git a/activerecord/test/cases/arel/visitors/ibm_db_test.rb b/activerecord/test/cases/arel/visitors/ibm_db_test.rb
index 2ddbec3266..b75e51908a 100644
--- a/activerecord/test/cases/arel/visitors/ibm_db_test.rb
+++ b/activerecord/test/cases/arel/visitors/ibm_db_test.rb
@@ -68,6 +68,13 @@ def compile(node)
           sql.must_be_like %{ "users"."name" IS NOT NULL }
         end
       end
+
+      describe "Nodes::Annotation" do
+        it "should wrap a hint in multiline OPTGUIDELINES delimiters" do
+          node = Arel::Nodes::Annotation.new "omg", hint: true
+          compile(node).must_be_like %{ /\*\ <OPTGUIDELINES\>omg\</OPTGUIDELINES\> \*/ }
+        end
+      end
     end
   end
 end

There would still be a user-facing ActiveRecord::Relation#optimizer_hints API addition for adding hints to relations, and Arel::Nodes::SelectCore would still hold a distinct optimizer_hints property so that a visitor could choose where in the generated output to place the given hints.

I'm curious what you would think of an approach like this. Thank you!

/cc @jeremy @rafaelfranca

@kamipo
Copy link
Member Author

kamipo commented Mar 16, 2019

Thanks for your comment @mattyoho.

I agree that perhaps optimizer hints and comment (annotation) could share something code.

But for the SQL visitor, I'd prefer simplest separated visitor methods

        def visit_Arel_Nodes_Annotation(o, collector)
          collector << "/* #{o.expr.join(" ")} */"
        end

        def visit_Arel_Nodes_OptimizerHints(o, collector)
          collector << "/*+ #{o.expr.join(" ")} */"
        end

than one complex visitor method

        ANNOTATION_START = "/* "  # :nodoc:
        ANNOTATION_END   = " */"  # :nodoc:
        HINT_START       = "/*+ " # :nodoc:
        HINT_END         = " */"  # :nodoc:
        HINT_SEPARATOR   = SPACE  # :nodoc:

        def visit_Arel_Nodes_Annotation(o, collector)
          collector << (o.hint? ? self.class::HINT_START : self.class::ANNOTATION_START)
          o.values.each do |value|
            visit(value, collector)
            collector << self.class::HINT_SEPARATOR if o.hint?
          end
          collector << (o.hint? ? self.class::HINT_END : self.class::ANNOTATION_END)
        end

.

We as Arm Treasure Data are using Optimizer Hints with a monkey patch
(https://gist.github.com/kamipo/4c8539f0ce4acf85075cf5a6b0d9712e),
especially in order to use `MAX_EXECUTION_TIME` (refer rails#31129).

Example:

```ruby
class Job < ApplicationRecord
  default_scope { optimizer_hints("MAX_EXECUTION_TIME(50000) NO_INDEX_MERGE(jobs)") }
end
```

Optimizer Hints is supported not only for MySQL but also for most
databases (PostgreSQL on RDS, Oracle, SQL Server, etc), it is really
helpful to turn heavy queries for large scale applications.
@kamipo kamipo merged commit 1db0506 into rails:master Mar 16, 2019
@kamipo kamipo deleted the optimizer_hints branch March 16, 2019 18:53
@mattyoho
Copy link
Contributor

@kamipo wrote:

But for the SQL visitor, I'd prefer simplest separated visitor methods [...] than one complex visitor method

That's an excellent point. It's also probably much more in line with the visitor design to use inheritance with simple methods, as well.

Thank you for considering my comment.

yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Mar 17, 2019
rails/rails#35615

Here are actual explain output with or without hints.

Note: `>` means IRB prompt.

* With FULL hint

```
> post = TestPost.optimizer_hints("FULL (\"TEST_POSTS\")")
> post = post.where(id: 1)
> post.explain
=> EXPLAIN for: SELECT /*+ FULL ("TEST_POSTS") */ "TEST_POSTS".* FROM "TEST_POSTS" WHERE "TEST_POSTS"."ID" = :a1 [["id", 1]]
Plan hash value: 1913008706

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_POSTS |     1 |    13 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEST_POSTS"."ID"=TO_NUMBER(:A1))
```

* Without hints

```ruby
> post2 = TestPost.where(id: 1)
):006:0> post2.explain
=> EXPLAIN for: SELECT "TEST_POSTS".* FROM "TEST_POSTS" WHERE "TEST_POSTS"."ID" = :a1 [["id", 1]]
Plan hash value: 2504341796

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |     1 |    13 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| SYS_C0011426 |     1 |    13 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TEST_POSTS"."ID"=TO_NUMBER(:A1))
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants