[New feature] Federated SQL query and Query optimization are going to sail out. #8284
Comments
This issue may be a good fit for me: how to transform the parsed result of ShardingSphere into the relational algebra of Calcite. |
@guimingyue Welcome! |
Hi @junwen12221 , |
@tristaZero |
@junwen12221 That's great! BTW, one issue has confused me a lot recently. Could I hear your thoughts on it?
My question is how to tell the Calcite adaptor to use these optimized RelNodes. AFAIK, Driver.getConnection(CALCITE_URL) is the only way to trigger Calcite to use the custom adaptor, which causes Calcite to parse the SQL again instead of using our optimized RelNode, doesn't it? How can we use the custom adaptor without the Calcite parser? Your GitHub activity tells me you are skilled with Calcite, so I look forward to your ideas. : ) |
Look at the example below
On line 127, you can replace the SQL parser. |
Hi @junwen12221, sorry for my late response; I have been writing a demo to implement query optimization and federated SQL these days. BTW, what do you think Looking forward to any innovative ideas you have. Best wishes, |
I'm glad that more open source projects use calcite to do data sharding based on SQL language. I don't care which project or community. This technology already exists in closed source software. Calcite converts SQL to
When the result of A conservative approach is to distinguish multiple converters or optimizers. In the first stage, when the SQL is converted into a RelNode, use Calcite's built-in table interface and rules; then, in the second stage, try your own rules or perform the next transformation. If you know all the rules, you can perform all transformations in one phase. |
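The two-stage idea described above can be sketched in plain Java. This is a toy model, not Calcite code: `PlanNode`, `TwoPhaseOptimizer`, and the `ShardingScan` operator are hypothetical names invented for illustration, and the "built-in" rules only mimic the Project-to-Calc and Filter-to-Calc steps that appear in the planner log later in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical plan node: an operator name plus at most one input. */
final class PlanNode {
    final String op;
    final PlanNode input;

    PlanNode(String op, PlanNode input) {
        this.op = op;
        this.input = input;
    }

    @Override
    public String toString() {
        return input == null ? op : op + "(" + input + ")";
    }
}

final class TwoPhaseOptimizer {

    /** Rewrites the tree bottom-up, offering every node to the rule once. */
    static PlanNode rewrite(PlanNode node, UnaryOperator<PlanNode> rule) {
        if (node == null) {
            return null;
        }
        return rule.apply(new PlanNode(node.op, rewrite(node.input, rule)));
    }

    /** Applies each rule in order, one full pass per rule. */
    static PlanNode applyAll(PlanNode root, List<UnaryOperator<PlanNode>> rules) {
        PlanNode current = root;
        for (UnaryOperator<PlanNode> rule : rules) {
            current = rewrite(current, rule);
        }
        return current;
    }

    static PlanNode optimize(PlanNode logicalPlan) {
        // Phase 1: stand-in for the framework's built-in rules.
        List<UnaryOperator<PlanNode>> builtIn = new ArrayList<>();
        builtIn.add(n -> n.op.equals("LogicalProject") || n.op.equals("LogicalFilter")
                ? new PlanNode("LogicalCalc", n.input) : n);
        // Phase 2: custom rules (hypothetical sharding-aware scan).
        List<UnaryOperator<PlanNode>> custom = new ArrayList<>();
        custom.add(n -> n.op.equals("LogicalTableScan")
                ? new PlanNode("ShardingScan", n.input) : n);
        return applyAll(applyAll(logicalPlan, builtIn), custom);
    }

    public static void main(String[] args) {
        PlanNode plan = new PlanNode("LogicalProject",
                new PlanNode("LogicalFilter", new PlanNode("LogicalTableScan", null)));
        System.out.println(optimize(plan));
    }
}
```

Keeping the two rule lists separate means the custom rules only ever see plans that the first phase has already normalized, which is the conservative option; merging both lists into one pass corresponds to the single-phase variant.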
Before we start work on this issue, we need to design the schema definition based on Calcite. I'm now investigating Calcite's schema system. |
@guimingyue +1. BTW, if your investigation makes progress, you are welcome to share it here. A demo is a good way to discuss and present it. |
Hi @junwen12221, I really appreciate your explanation. Here is my understanding, Later, I tried to learn more about plan rules, which is a headache for me. For example, what's the difference between Worse still, Calcite's documentation is too sparse to answer the questions above. :( I wonder how you mastered this complicated tool. Are there any links or docs that can teach users more? BTW, do you mind exchanging WeChat numbers? If possible, could you send your WeChat number to Best, |
Generally, logical RelNode(
Then it goes through the (rule) converter to be transformed into another RelNode with a new org.apache.calcite.plan.RelRule, a recently added class, in order to change the behavior of the rule through configuration. CoreRules is also a recently added class that came out of a refactoring. Simply put, it sums up the rules of According to the same idea, we can guess the purpose of Despite all the above, they are actually some cases of |
Hi, @junwen12221
If Also, here are some of my thoughts on this issue. If I missed something, corrections are welcome. :)
|
My greetings to @junwen12221 and @guimingyue. Based on the points @junwen12221 made earlier, I wrote a demo for this issue, i.e., SQL federation and SQL query optimization (looking forward to your collaboration ✊). In this demo, SQL federation using the Calcite JDBC driver works well. I also wrote a raw executor (as mentioned in 3 above) that does parsing, validating, optimizing and executing, which you can view as a custom executor driver. Once this custom executor driver runs end to end, we can replace the Calcite parser with our SQL parser and add more plan rules to Unfortunately, this I tried every method I could, but failed, so I sincerely ask for any help and pointers! The exception info is presented below; you can also run The reason, I guess, is related to using the custom schema, since it differs from the example's new ReflectiveSchema(). But I have no way to debug it or work out a solution. The only similar question I found is apache-calcite-querying-without-using-the-jdbc-api, FYI.
15:01:08.019 [main] DEBUG org.apache.calcite.plan.RelOptPlanner - Provenance:
rel#19:EnumerableCalc.ENUMERABLE(input=EnumerableCalc#18,expr#0..2={inputs},0=$t0)
direct
rel#17:EnumerableCalc.ENUMERABLE(input=RelSubset#16,expr#0..2={inputs},0=$t0)
call#9 rule [EnumerableCalcRule(in:NONE,out:ENUMERABLE)]
rel#14:LogicalCalc.NONE(input=RelSubset#6,expr#0..2={inputs},0=$t0)
call#5 rule [ProjectToCalcRule]
rel#7:LogicalProject.NONE(input=RelSubset#6,inputs=0)
no parent
rel#18:EnumerableCalc.ENUMERABLE(input=EnumerableTableScan#11,expr#0..2={inputs},expr#3=10,expr#4=<($t0, $t3),proj#0..2={exprs},$condition=$t4)
direct
rel#15:EnumerableCalc.ENUMERABLE(input=RelSubset#12,expr#0..2={inputs},expr#3=10,expr#4=<($t0, $t3),proj#0..2={exprs},$condition=$t4)
call#7 rule [EnumerableCalcRule(in:NONE,out:ENUMERABLE)]
rel#13:LogicalCalc.NONE(input=RelSubset#4,expr#0..2={inputs},expr#3=10,expr#4=<($t0, $t3),proj#0..2={exprs},$condition=$t4)
call#3 rule [FilterToCalcRule]
rel#5:LogicalFilter.NONE(input=RelSubset#4,condition=<($0, 10))
no parent
rel#11:EnumerableTableScan.ENUMERABLE(table=[sharding, t_order])
call#1 rule [EnumerableTableScanRule(in:NONE,out:ENUMERABLE)]
rel#1:LogicalTableScan.NONE(table=[sharding, t_order])
no parent
java.lang.NullPointerException
at Baz.bind(Unknown Source)
at federated.sql.executor.CalciteRawExecutor.execute(CalciteRawExecutor.java:184)
at federated.sql.executor.CalciteRawExecutor.execute(CalciteRawExecutor.java:173)
at federated.sql.executor.CalciteRawExecutorTest.assertSingleExecute(CalciteRawExecutorTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) |
I will check out this demo, and debug it tomorrow. |
@guimingyue @tristaZero |
@junwen12221 I added the VM options you mentioned in https://github.com/tristaZero/federatedSQL/pull/2, and I found the line that threw the NPE. It's |
Hi @guimingyue @junwen12221 , After adding VM opts, I found the program didn't enter the core part |
read it.
-Dorg.codehaus.janino.source_debugging.dir=federatedSQL\target\ (target is the Maven target folder). If it doesn't work, you can copy the code of the function to replace the generated object for debugging. |
CoreRules: just rules that perform logical (not physical, Iterable) transformations on relational expressions. They are generally used for RBO. This step mainly applies heuristic rules, i.e., rule-based optimization (RBO), so it is often called the RBO stage. It is possible to implement the custom adaptor for federated queries. We need to consider SQL optimization. If we use the Calcite JDBC Driver, the only way to affect the optimization process is to implement the interfaces for different kinds of tables, like TranslatableTable, and provide RelOptRule. On the other hand, we may try to skip the Calcite JDBC Driver and call the parse, validate, optimize and execute functions in our new custom execute Driver, which requires an in-depth understanding of Calcite's source code as well as coding work. The Calcite JDBC Driver provides some default advanced features including
It may retain the functionality of the Calcite JDBC driver, but it is not a JDBC interface. (adapter) https://calcite.apache.org/docs/adapter.html With 3, we can use our parser engine to parse SQL instead of the Calcite parser, providing broad SQL support, as our SQL parser can parse SQL from different database dialects. On the whole, it can be achieved. |
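To make the "RBO stage" above concrete, here is a minimal sketch of one heuristic rewrite applied until the plan stops changing. It is plain Java and does not use Calcite's classes: `Rel` and `FilterMergeSketch` are hypothetical names, and the single rule shown (collapsing a filter that sits directly on another filter) is only an example of the kind of logical transformation such a stage performs.

```java
/** Hypothetical logical node: a kind, an optional condition, an optional input. */
final class Rel {
    final String kind;
    final String condition; // null for nodes without a condition
    final Rel input;        // null for leaf nodes

    Rel(String kind, String condition, Rel input) {
        this.kind = kind;
        this.condition = condition;
        this.input = input;
    }

    @Override
    public String toString() {
        String head = condition == null ? kind : kind + "[" + condition + "]";
        return input == null ? head : head + "(" + input + ")";
    }
}

final class FilterMergeSketch {

    /** One bottom-up pass: a Filter over a Filter becomes one Filter with both conditions. */
    static Rel mergeFilters(Rel rel) {
        if (rel == null) {
            return null;
        }
        Rel input = mergeFilters(rel.input);
        if (rel.kind.equals("Filter") && input != null && input.kind.equals("Filter")) {
            return new Rel("Filter", rel.condition + " AND " + input.condition, input.input);
        }
        return new Rel(rel.kind, rel.condition, input);
    }

    /** Repeats the pass until the printed plan no longer changes; plan must be non-null. */
    static Rel optimize(Rel plan) {
        Rel current = plan;
        while (true) {
            Rel next = mergeFilters(current);
            if (next.toString().equals(current.toString())) {
                return next;
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        Rel plan = new Rel("Filter", "a < 10",
                new Rel("Filter", "b > 1", new Rel("Scan", null, null)));
        System.out.println(optimize(plan));
    }
}
```

No cost model is involved here: the rule fires whenever its pattern matches, which is what distinguishes this heuristic style from cost-based planning.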
Hi @junwen12221 ,
I suppose these features are far away from us since currently we just focus on the adaptor, rules optimization, and the conversion between parsed result and Speaking of When it comes to
Note that the conversion from the SQLStatement of the ShardingSphere parser to the SqlNode of Calcite is a prerequisite for broad SQL support. Please feel free to share your views on the content above. : ) BTW, I updated the federatedSQL project with a non-generated-code converter to avoid @guimingyue How is it going? If you want to try writing something about query optimization, our federatedSQL repo is the right place, I guess. :) |
@tristaZero |
org.apache.calcite.schema.ProjectableFilterableTable is indeed a quick-start interface. |
This commit adds a new executor for executing the physical plan of a relational expression; the executor is based on the volcano model. The details of this commit follow below.
* Add a new package exec in , in this package, the main part is the Executors, which are the implementations of physical RelNodes. For example, NestedLoopJoinExecutor is the implementation of the physical RelNode SSNestedLoopJoin.
* Some Executors need to execute functions to filter or aggregate rows; that is what BuiltinFunction does. When an Executor instance is built, a BuiltinFunction may be created and passed to it. For example, org.apache.shardingsphere.infra.executor.exec.CalcExecutor#build creates a BuiltinFunction for the filter operator to filter rows.
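For readers unfamiliar with the volcano model mentioned in this commit, the following is a minimal sketch of the open/next/close iterator style. It is not the code from the commit: `RowExecutor`, `ScanExecutor`, `FilterExecutor`, and `VolcanoSketch` are hypothetical names, and a plain `Predicate` stands in for the role the commit describes for BuiltinFunction.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical volcano-style operator: rows are pulled one at a time. */
interface RowExecutor {
    void open();
    Object[] next(); // returns null when no rows are left
    void close();
}

/** Leaf operator that replays an in-memory list of rows. */
final class ScanExecutor implements RowExecutor {
    private final List<Object[]> rows;
    private Iterator<Object[]> cursor;

    ScanExecutor(List<Object[]> rows) {
        this.rows = rows;
    }

    @Override public void open() { cursor = rows.iterator(); }
    @Override public Object[] next() { return cursor.hasNext() ? cursor.next() : null; }
    @Override public void close() { cursor = null; }
}

/** Pulls rows from its input and passes on only those matching the condition. */
final class FilterExecutor implements RowExecutor {
    private final RowExecutor input;
    private final Predicate<Object[]> condition;

    FilterExecutor(RowExecutor input, Predicate<Object[]> condition) {
        this.input = input;
        this.condition = condition;
    }

    @Override public void open() { input.open(); }

    @Override public Object[] next() {
        Object[] row;
        while ((row = input.next()) != null) {
            if (condition.test(row)) {
                return row;
            }
        }
        return null;
    }

    @Override public void close() { input.close(); }
}

final class VolcanoSketch {
    /** Drives the root operator until it is exhausted and collects the rows. */
    static List<Object[]> drain(RowExecutor root) {
        List<Object[]> out = new ArrayList<>();
        root.open();
        try {
            Object[] row;
            while ((row = root.next()) != null) {
                out.add(row);
            }
        } finally {
            root.close();
        }
        return out;
    }
}
```

A caller would build the tree from the leaves up, for example `new FilterExecutor(new ScanExecutor(rows), row -> ((Integer) row[0]) < 10)`, and hand the root to `VolcanoSketch.drain`. Because each operator only asks its input for the next row, new operators such as a join can be added without changing the driver loop.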
add a SqlNodeConverter class to convert ss ast to calcite ast.
* to #8284 add a SqlNodeConverter class to convert ss ast to calcite ast.
* use ast SqlNodeConverter implementation to convert ast instead of all in one
* refactor sql node converter
* modify according to code style
* refactor offset and row count sql node converter
* fix code style checking error
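The "one converter per AST piece instead of all in one" idea in these commit messages might look roughly like the sketch below. Every type here is a hypothetical stand-in written for illustration (`NodeConverter`, `SsLimit`, `SsSelect`, `TargetSelect`, and the converter classes); none of them is the actual ShardingSphere or Calcite class, and the target side is reduced to strings to keep the example self-contained.

```java
import java.util.Optional;

/** Hypothetical contract: convert one source AST piece, or return empty if absent. */
interface NodeConverter<S, T> {
    Optional<T> convert(S source);
}

/** Stand-in for a parsed LIMIT segment; either field may be null. */
final class SsLimit {
    final Long offset;
    final Long rowCount;

    SsLimit(Long offset, Long rowCount) {
        this.offset = offset;
        this.rowCount = rowCount;
    }
}

/** Stand-in for a parsed SELECT statement; limit is null when there is no LIMIT. */
final class SsSelect {
    final String table;
    final SsLimit limit;

    SsSelect(String table, SsLimit limit) {
        this.table = table;
        this.limit = limit;
    }
}

/** Stand-in for the target AST node. */
final class TargetSelect {
    final String from;
    final Optional<String> offset;
    final Optional<String> fetch;

    TargetSelect(String from, Optional<String> offset, Optional<String> fetch) {
        this.from = from;
        this.offset = offset;
        this.fetch = fetch;
    }
}

final class OffsetConverter implements NodeConverter<SsLimit, String> {
    @Override
    public Optional<String> convert(SsLimit limit) {
        return Optional.ofNullable(limit).map(l -> l.offset).map(v -> String.valueOf(v));
    }
}

final class RowCountConverter implements NodeConverter<SsLimit, String> {
    @Override
    public Optional<String> convert(SsLimit limit) {
        return Optional.ofNullable(limit).map(l -> l.rowCount).map(v -> String.valueOf(v));
    }
}

/** Composes the small converters instead of handling every clause itself. */
final class SelectConverter implements NodeConverter<SsSelect, TargetSelect> {
    private final OffsetConverter offsetConverter = new OffsetConverter();
    private final RowCountConverter rowCountConverter = new RowCountConverter();

    @Override
    public Optional<TargetSelect> convert(SsSelect select) {
        if (select == null) {
            return Optional.empty();
        }
        return Optional.of(new TargetSelect(select.table,
                offsetConverter.convert(select.limit),
                rowCountConverter.convert(select.limit)));
    }
}
```

Returning `Optional` from every converter lets the composing converter treat a missing clause and a present clause uniformly, so supporting another clause means adding one small class rather than growing a single large method.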
Hi community,
As you know, ShardingSphere has made a lot of efforts on SQL parser and provided a great independent SQL parser to help users parse SQL.
Based on this substantial work, we plan to add query optimization, which optimizes the SQL that users submit and produces an optimized query plan to improve query efficiency. In addition, the federated SQL query feature (such as join queries across different instances) is another essential highlight of our next release. : )
We will leverage Apache Calcite, an excellent framework, to implement these two features. Currently, three main areas of work are presented here.
TranslatableTable API.
Actually, there is plenty of work to do on this issue. We are in the investigation phase now and will seek contributors for this issue later. If you are interested in this one, please give it a watch. 😉
10th January 2021 Task Update
Hi, community,
Here is the progress update so far.
Functions
CalciteLogicSchemaFactory, CalciteLogicSchema, and CalciteFilterableTable
CalciteFilterableTable (CalciteExecutionSQLGenerator) (Q2)
CalciteRawExecutor (Especially ResultSetMetadata) (Q2)
CalciteExecutor (Q2)
Unit test (Q2)
CalciteLogicSchemaFactory #8965)
CalciteJDBCExecutor to test SQL federation feature #8883)
SQL Federation Scenario