
DRILL-3623: Use shorter query path for LIMIT 0 queries on schema-ed tables #193

Closed
wants to merge 1 commit

Conversation

@sudheeshkatkam (Contributor)

Initial patch.

The DrillTable#providesDeferredSchema method is used by the NonDeferredSchemaTableLimit0Visitor to check whether the table can provide its schema directly; if so, the result is returned directly.

It seems the shorter query path for this query needs a hacky "otherPlan" in the DefaultSqlHandler without major refactoring (Should I go ahead and make changes?). This also means that "EXPLAIN PLAN ..." returns a plan that is different from the actual query plan (without a check in ExplainHandler, another hack).

I think the classes need more meaningful names (NonDeferredSchemaTableLimit0Visitor).

Also, note the type conversion using CALCITE_TO_DRILL_TYPE_MAPPING.
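As background for the CALCITE_TO_DRILL_TYPE_MAPPING note above, here is a minimal, hypothetical sketch of the kind of check involved: a table's schema is usable for the short path only if every column's Calcite type has a known Drill equivalent. All names here (SchemaCheckSketch, hasKnownSchema, the mapping entries) are illustrative, not Drill's actual API.

```java
import java.util.Map;

// Hypothetical sketch only; not Drill's actual CALCITE_TO_DRILL_TYPE_MAPPING.
public class SchemaCheckSketch {
    // Simplified stand-in for a Calcite-to-Drill type mapping.
    static final Map<String, String> CALCITE_TO_DRILL = Map.of(
            "INTEGER", "INT",
            "BIGINT", "BIGINT",
            "VARCHAR", "VARCHAR",
            "DOUBLE", "FLOAT8");

    // The table "provides" its schema only if every column type is mapped;
    // an unresolved ANY column (schema-less source) fails the check.
    public static boolean hasKnownSchema(String... calciteTypeNames) {
        for (String type : calciteTypeNames) {
            if (!CALCITE_TO_DRILL.containsKey(type)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasKnownSchema("INTEGER", "VARCHAR")); // true
        System.out.println(hasKnownSchema("INTEGER", "ANY"));     // false
    }
}
```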

@@ -179,4 +179,9 @@ private void throwUnsupportedHiveDataTypeError(String hiveType) {

throw new RuntimeException(errMsg.toString());
}

@Override
public boolean providesDeferredSchema() {
Contributor:

Double negative here. How about
hasKnownSchema()

With javadoc.

Contributor:

I believe there are other places (the select * replacement) that depend on the same type of property. We should use a single method to determine this. @jinfengni , can you point Sudheesh in the right direction?

Contributor:

or maybe simply

isFullySchemaed()

Contributor:

I think DrillHiveTable or other schema-aware tables would return RelRecordType (in Calcite) when calling typeFactory.createStructType() for the RowType.

For DrillDynamicTable, which is schema-less, the RowType would be RelDataTypeDrillImpl.

@sudheeshkatkam (Contributor Author)

@jinfengni please review.

}

@Override
public RelNode visit(TableScan scan) {
Contributor:

You also need to override visitValues, since Values could be a leaf operator as well, in addition to TableScan.

@jinfengni (Contributor)

Please modify the title of JIRA DRILL-3623, since the new pull request is using a completely different approach to address the performance issue for "LIMIT 0".

@jacques-n (Contributor)

What happened to the original strategy of short-circuiting on schema'd files? This approach still means we have to pay for all the operator compilations for no reason.

@jinfengni (Contributor)

The original approach (skipping the execution phase for limit 0 completely) could actually have issues in some cases, due to differences between the Calcite rules and the Drill execution rules in how types are determined.

For example, sum(int) in Calcite resolves to int, while in Drill execution we change it to bigint. Another case is implicit cast. Currently, there are some small differences between Calcite and Drill execution. That means that if we skip execution for limit 0, the types resolved by Calcite could differ from the types the query would get if it went through Drill execution. For a BI tool like Tableau, that means the type returned from a "limit 0" query and the type from a second query w/o "limit 0" could be different.

If we want to avoid the above issues, we have to detect all those cases, which is painful. That's why Sudheesh and I are now more inclined towards this new approach.
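The sum(int) widening mentioned above can be made concrete in plain Java: summing 32-bit ints can silently overflow, which is why an engine may widen the aggregate result to a 64-bit type. This is an illustrative sketch, not Drill or Calcite code.

```java
// Illustrative only: why SUM over int columns is safer as a 64-bit result.
public class SumWidening {
    // Keeping the result as int silently wraps around on overflow.
    public static int sumAsInt(int a, int b) {
        return a + b;
    }

    // Widening to long (bigint) keeps the sum exact.
    public static long sumAsBigint(int a, int b) {
        return (long) a + (long) b;
    }

    public static void main(String[] args) {
        System.out.println(sumAsInt(Integer.MAX_VALUE, 1));    // wraps: negative
        System.out.println(sumAsBigint(Integer.MAX_VALUE, 1)); // exact: 2147483648
    }
}
```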

@jacques-n (Contributor)

Got it. Thanks for the explanation. So this is a hack until we can solve those issues.

I think we have to do this work, however. A 1-2 second response to a limit 0 query is too much. We should open up JIRAs for all of these inconsistency issues and then get Calcite and Drill into alignment.

What do you think we're talking about: aggregation outputs and implicit casting. What else?

@jinfengni (Contributor)

Sudheesh and I feel this new approach is more of a big optimization step towards solving the performance issue for "limit 0" queries than a hack solution: 1) It shows a quite significant reduction in query time, from hundreds of seconds to a couple of seconds in some cases. That's a big improvement. 2) It benefits not only schema-based queries but also schema-less queries, while the original approach would apply only to schema-based queries.

I agree we should continue to optimize "limit 0" queries. But for now, I think this new approach has its own merits.

Aggregation and implicit casting are the two things that I can think of, if we go with the schema-based approach.

@sudheeshkatkam (Contributor Author)

Also, on the execution side, the previous approach was hitting DRILL-2288, where sending exactly one batch with a schema and without data is not handled correctly by various RecordBatches. With a fix for that issue, we could add further optimization for schema-ed tables (i.e. add the previous implementation), with this implementation as the fallback.

@jacques-n (Contributor)

Interesting. Can you explain where the time is coming from? It isn't clear to me why this will have a big impact over what we had before. While you're pushing the limit down to just above the scan nodes, we already had an optimization which avoided parallelization. Since we're pipelined this really shouldn't matter much. Is limit zero not working right in the limit operator? It should terminate upon receiving schema, not wait until a batch of actual records (I'm wondering if it is doing the latter). Is sending zero records through causing operators to skip compilation? In what cases was this change taking something from hundreds of seconds to a few seconds? I'm asking these questions so I can better understand as I want to make sure there isn't a bug somewhere else. Thanks!

@sudheeshkatkam (Contributor Author)

I think I see the source of confusion (sorry); this patch does not address the query in the JIRA, which is why Jinfeng asked me to change the title in one of his comments. Regarding that query, DRILL-3921 helps avoid most of the execution time, but we still incur the planning time. My initial approach addresses this issue, but as mentioned above, it is blocked by DRILL-2288 and other things.

The new approach actually addresses any query that has a limit 0 above a blocking operator that consumes all records. And avoiding parallelization made the query much worse. (Actually, was fast-schema supposed to still kick in? It did not seem like it from my experiments.)

I tested against a query like SELECT * FROM (SELECT COUNT(DISTINCT a), COUNT(DISTINCT b), COUNT(DISTINCT c) FROM very_large_table) T LIMIT 0, and it completed two orders of magnitude faster.

@jacques-n (Contributor)

I'm sorry to say that I'm -1 on this change.

It seems to be adding a planning rewrite rule where there should be a simple fix for an execution bug. Let's just fix the execution bug.

Limit 0 should complete its execution the moment it receives a schema (as part of fast schema). It doesn't need to receive any records. You just described a situation where it is waiting for records from a blocking operator. That shouldn't be the case. If there is some other real benefit to this change after that execution bug is fixed, let's revisit in that light.

If you think I'm misunderstanding your description of the execution behavior or the dynamics involved, please help me to better understand.

@jacques-n (Contributor)

Just to add to my comment above, if you want to do a quick call or hangout to discuss I'm more than happy to. As I said above, it is possible I am misunderstanding. If so, I'll definitely revise my objection.

@sudheeshkatkam (Contributor Author)

Updated the patch: for LIMIT 0 queries, short-circuit before logical transformation if all field types are known for the root node.

* Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
* unlike {@link DrillScanRel}.
*/
public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
Contributor:

Can we just use the VALUES operator instead? I think that Calcite already has the code to do this in ReduceExpressionsRule.

Contributor Author:

With DrillDirectScanRel, the execution plan is: Screen <-- Project <-- DrillDirectScanRel. In my experiments (TPCDS queries), the total time did not cross 0.6 s.

With the VALUES operator, there needs to be a LIMIT(0) operator on top. Then the execution plan is: Screen <-- Project <-- SVR <-- Limit <-- Values. In my experiments, sometimes (seen twice) Project and SVR take ~0.5 s to set up, and the query takes 2 s to complete, but this is not reproducible.

Contributor Author:

Should I stick to the current impl?

Contributor:

Sounds like the DrillDirectScanRel approach works only when "limit 0" is at the root level; it would not apply when "limit 0" is inside a subquery.

The Values approach may apply to the case of "limit 0" in a subquery. Certainly, this JIRA only targets the case of "limit 0" at the root. But with the VALUES approach, there is room for improvement for more general scenarios.

Contributor:

Yeah, I think DrillDirectScanRel is a wrong turn for this functionality, because you can't see that it is empty and reason about it. A DirectGroupScan is a runtime thing, so shouldn't be floating around at planning time.

I notice that DrillValuesRel does not extend Values. If it did, it would get all of the rules in PruneEmptyRules for free - pruning away Filter, Project etc. on top of it.

Contributor:

There is one challenge with the Values execution in Drill: we use data to encode types (and generate the vectors). It seems like the ideal would be expressing a values operation that has no data. Maybe we should just support a local limit in the values operator? That would allow us to bypass adding the limit(0) and SV remover for the simple case. Generally, we should probably support leaf-node limit pushdown anyway.

I see the new patch takes a different approach to the one above. One of the things that seemed to be an issue above is that the Limit operator was not properly terminating its parents in the fast-schema case of a limit 0. @sudheeshkatkam and @jinfengni, do you agree that is an issue? If it is, we should get a JIRA opened for it.
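The "use data to encode types" point can be sketched as follows: to execute a zero-row VALUES with a known schema, one fake tuple is emitted whose literal types carry the column types, and a limit(0) above discards it. This is a hypothetical, simplified illustration, not Drill's actual Values implementation; the type names and the FakeTupleSketch class are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: encode column types via one dummy literal per column,
// so downstream operators can materialize vectors of the right type before
// a limit(0) discards the row. Not Drill code.
public class FakeTupleSketch {
    public static List<Object> fakeTuple(List<String> columnTypes) {
        List<Object> tuple = new ArrayList<>();
        for (String type : columnTypes) {
            switch (type) {
                case "INT":     tuple.add(0); break;   // int literal encodes INT
                case "BIGINT":  tuple.add(0L); break;  // long literal encodes BIGINT
                case "VARCHAR": tuple.add(""); break;  // string literal encodes VARCHAR
                default:        tuple.add(null);       // unknown type: no usable literal
            }
        }
        return tuple;
    }

    public static void main(String[] args) {
        System.out.println(fakeTuple(List.of("INT", "VARCHAR")));
    }
}
```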

Contributor:

One other note on the Calcite rule: it seems like we should just modify the Calcite mainline rule to avoid applying the zero records values operator optimization in the case that a column is not yet bound to a type (is an ANY column). That way we can stop maintaining our version of the rule. @jaltekruse, thoughts?

Contributor Author:

@jinfengni I'll post the patch with Values approach.

@julianhyde That makes sense.

@jacques-n Since we use data to encode types in Values, pushing Limit (especially zero) into Values requires a sizable change (?). Also, I think there is a bug, since Calcite allows for creating a LogicalValues with types and without literals, and Drill does not handle this case.

Regarding the above approach (putting a limit 0 on top of the scan), I think fast schema wasn't sufficient because there was only one record batch during execution (extreme case). Right now, I do not see any specific issue there.

Regarding the note, can you expand on what you mean? Change the visitor pattern to a logical rule? Or is there a logical rule that conflicts with this change, and this shorter path should be avoided?

Contributor:

My note was in reference to DrillReduceExpressionsRule.

Basically, you should be able to modify this class's implementation of createEmptyRelOrEquivalent() to switch to a values (with fake data) operator followed by a limit(0). At least that was the thought.

Contributor Author:

The getValuesRelIfFullySchemaed(...) check is done before logical transformation to avoid creating expensive objects while applying rules (unless rules are ordered). For example, DrillScanRule creates a DrillScanRel object, which constructs a group scan object that can be quite expensive (see HiveScan, MongoGroupScan, HbaseGroupScan).
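The ordering argument above (run the cheap schema check before any rule constructs an expensive group scan) can be sketched with a deferred supplier: the costly construction is only triggered when the short path does not apply. All names here are hypothetical; this is not Drill's planner code.

```java
import java.util.function.Supplier;

// Hypothetical sketch: short-circuit before the expensive work is triggered.
public class ShortCircuitSketch {
    static int expensiveConstructions = 0;

    // Stands in for building a HiveScan/MongoGroupScan-style group scan.
    static String expensiveGroupScan() {
        expensiveConstructions++;
        return "group-scan";
    }

    // Cheap check first; the costly supplier is invoked only when needed.
    public static String planLimit0(boolean schemaKnown, Supplier<String> scan) {
        if (schemaKnown) {
            return "values(0 rows)"; // short query path: no scan is built
        }
        return scan.get();
    }

    public static void main(String[] args) {
        System.out.println(planLimit0(true, ShortCircuitSketch::expensiveGroupScan));
        System.out.println(expensiveConstructions); // scan was never constructed
    }
}
```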

  the root logical node is known

+ DRILL-4043: Perform LIMIT 0 optimization before logical transformation
@sudheeshkatkam (Contributor Author)

Does the latest patch look fine? Also, please see the note in FindLimit0Visitor.

.build();
final RelTraitSet traits = rel.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
// TODO: ideally, we want the limit to be pushed into values
final DrillValuesRel values = new DrillValuesRel(rel.getCluster(), rel.getRowType(), tuples, traits);
Contributor:

FindLimit0Visitor.containsLimit0(relNode) is called when relNode is a LogicalRel. Can we create a LogicalValues with an empty list of tuples, instead of creating DrillValues with one tuple followed by a DrillLimit(0)?

I think LogicalValues would allow an empty list of tuples, right?

Contributor Author:

I did this to avoid logical transformation completely, as this is an exceptional case.

LogicalValues allows an empty list of tuples, but as Jacques pointed out, Drill does not handle that well (we use data to encode types).

@jinfengni (Contributor)

Overall, looks good to me. +1

@sudheeshkatkam (Contributor Author)

Moving to #405.
