Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When use muti data source join query join condition in where clause failed cause by missing conversions #2708

Closed
Javelin2007 opened this issue Nov 18, 2022 · 2 comments

Comments

@Javelin2007
Copy link

Javelin2007 commented Nov 18, 2022

  1. The version information is as follows:
    apache drill> select commit_message, commit_time from sys.version;
    +----------------------------------------------------------------------------------+---------------------------+
    |                                  commit_message                                  |        commit_time        |
    +----------------------------------------------------------------------------------+---------------------------+
    | DRILL-8357: Add new config options to the Splunk storage plugin (extra docs) (#2706) | 15.11.2022 @ 20:34:55 CST |
    +----------------------------------------------------------------------------------+---------------------------+
    1 row selected (0.133 seconds)
  1. Create tables and insert data in postgresql and mysql database
    create table t1(col1 int);
    insert into t1 values(1), (2);
  1. Create mysql and postgresql Plugins in Storage label via http://localhost:8047/storage pages

  2. The sql statement is executed as follows

  apache drill> select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 ;
  Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join. 
  If a cartesian or inequality join is used intentionally, set the option 'planner.enable_nljoin_for_scalar_only' to false and try again.
  
  
  [Error Id: 91583f69-5aa6-44d5-a29c-63e2358932ea ] (state=,code=0)
  apache drill> set planner.enable_nljoin_for_scalar_only=false;
  +------+------------------------------------------------+
  |  ok  |                    summary                     |
  +------+------------------------------------------------+
  | true | planner.enable_nljoin_for_scalar_only updated. |
  +------+------------------------------------------------+
  1 row selected (0.212 seconds)
  apache drill> select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 ;
  Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join. 
  If a cartesian or inequality join is used intentionally, set the option 'planner.enable_nljoin_for_scalar_only' to false and try again.
  
  
  [Error Id: 37d0dbca-40d1-4de5-9443-b48ce3a172c0 ] (state=,code=0)
  apache drill> 
  1. The below is drillbit.log info
  2022-11-18 08:08:01,169 [1c88c29e-3040-efc1-46a6-33152cfbab32:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id 1c88c29e-3040-efc1-46a6-33152cfbab32 issued by test: select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 
  2022-11-18 08:08:01,258 [1c88c29e-3040-efc1-46a6-33152cfbab32:foreman] ERROR o.a.d.e.p.s.h.DefaultSqlHandler - There are not enough rules to produce a node with desired properties: convention=PHYSICAL, DrillDistributionTraitDef=SINGLETON([]), sort=[].
  Missing conversions are JdbcFilter[convention: JDBC.pgsql -> JDBC.my], JdbcFilter[convention: JDBC.my -> JDBC.pgsql]
  There are 2 empty subsets:
  Empty subset 0: rel#15241:RelSubset#12.JDBC.my.ANY([]).[], the relevant part of the original plan is as follows
  15211:JdbcFilter(condition=[=($0, 1)])
    14914:JdbcTableScan(subset=[rel#15210:RelSubset#11.JDBC.pgsql.ANY([]).[]], table=[[pgsql, t1]])
  
  Empty subset 1: rel#15247:RelSubset#15.JDBC.pgsql.ANY([]).[], the relevant part of the original plan is as follows
  15216:JdbcFilter(condition=[=($0, 1)])
    14915:JdbcTableScan(subset=[rel#15215:RelSubset#14.JDBC.my.ANY([]).[]], table=[[my, test, t1]])
  
  Root: rel#15224:RelSubset#18.PHYSICAL.SINGLETON([]).[]
  Original rel:
  LogicalProject(subset=[rel#14956:RelSubset#4.LOGICAL.ANY([]).[]], col1=[$0], col10=[$1]): rowcount = 3.375E15, cumulative cost = {3.375E15 rows, 6.75E15 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14954
    LogicalFilter(subset=[rel#14953:RelSubset#3.NONE.ANY([]).[]], condition=[AND(=($0, $1), =($0, 1), =($1, 1))]): rowcount = 3.375E15, cumulative cost = {3.375E15 rows, 1.0E18 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14952
      LogicalJoin(subset=[rel#14951:RelSubset#2.NONE.ANY([]).[]], condition=[true], joinType=[inner]): rowcount = 1.0E18, cumulative cost = {1.0E18 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14950
        JdbcTableScan(subset=[rel#14948:RelSubset#0.JDBC.pgsql.ANY([]).[]], table=[[pgsql, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14914
        JdbcTableScan(subset=[rel#14949:RelSubset#1.JDBC.my.ANY([]).[]], table=[[my, test, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14915
  
  Sets:
  Set#11, type: RecordType(INTEGER col1)
	  rel#15210:RelSubset#11.JDBC.pgsql.ANY([]).[], best=rel#14914
		  rel#14914:JdbcTableScan.JDBC.pgsql.ANY([]).[](table=[pgsql, t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15227:RelSubset#11.LOGICAL.ANY([]).[], best=rel#15226
		  rel#15226:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15210), rowcount=1.0E9, cumulative cost={1.000001E8 rows, 1.00000101E8 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15254:RelSubset#11.PHYSICAL.SINGLETON([]).[], best=rel#15253
		  rel#15253:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15210), rowcount=1.0E9, cumulative cost={1.0000001E9 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
  Set#12, type: RecordType(INTEGER col1)
	  rel#15212:RelSubset#12.JDBC.pgsql.ANY([]).[], best=rel#15211
		  rel#15211:JdbcFilter.JDBC.pgsql.ANY([]).[](input=RelSubset#15210,condition==($0, 1)), rowcount=1.5E8, cumulative cost={1.500001E8 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15246:AbstractConverter.JDBC.pgsql.ANY([]).[](input=RelSubset#15232,convention=JDBC.pgsql,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
	  rel#15230:RelSubset#12.LOGICAL.ANY([]).[], best=rel#15213
		  rel#15213:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15212), rowcount=1.5E8, cumulative cost={1.650001E8 rows, 1.015000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15258:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15241), rowcount=1.5E8, cumulative cost={inf}
	  rel#15232:RelSubset#12.PHYSICAL.SINGLETON([]).[], best=rel#15231
		  rel#15231:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15212), rowcount=1.5E8, cumulative cost={3.000001E8 rows, 1.150000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15271:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15241), rowcount=1.5E8, cumulative cost={inf}
	  rel#15241:RelSubset#12.JDBC.my.ANY([]).[], best=null
		  rel#15242:AbstractConverter.JDBC.my.ANY([]).[](input=RelSubset#15232,convention=JDBC.my,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
  Set#14, type: RecordType(INTEGER col1)
	  rel#15215:RelSubset#14.JDBC.my.ANY([]).[], best=rel#14915
		  rel#14915:JdbcTableScan.JDBC.my.ANY([]).[](table=[my, test, t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15235:RelSubset#14.LOGICAL.ANY([]).[], best=rel#15234
		  rel#15234:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15215), rowcount=1.0E9, cumulative cost={1.000001E8 rows, 1.00000101E8 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15256:RelSubset#14.PHYSICAL.SINGLETON([]).[], best=rel#15255
		  rel#15255:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15215), rowcount=1.0E9, cumulative cost={1.0000001E9 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
  Set#15, type: RecordType(INTEGER col1)
	  rel#15217:RelSubset#15.JDBC.my.ANY([]).[], best=rel#15216
		  rel#15216:JdbcFilter.JDBC.my.ANY([]).[](input=RelSubset#15215,condition==($0, 1)), rowcount=1.5E8, cumulative cost={1.500001E8 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15243:AbstractConverter.JDBC.my.ANY([]).[](input=RelSubset#15240,convention=JDBC.my,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
	  rel#15238:RelSubset#15.LOGICAL.ANY([]).[], best=rel#15218
		  rel#15218:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15217), rowcount=1.5E8, cumulative cost={1.650001E8 rows, 1.015000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15268:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15247), rowcount=1.5E8, cumulative cost={inf}
	  rel#15240:RelSubset#15.PHYSICAL.SINGLETON([]).[], best=rel#15239
		  rel#15239:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15217), rowcount=1.5E8, cumulative cost={3.000001E8 rows, 1.150000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15274:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15247), rowcount=1.5E8, cumulative cost={inf}
	  rel#15247:RelSubset#15.JDBC.pgsql.ANY([]).[], best=null
		  rel#15248:AbstractConverter.JDBC.pgsql.ANY([]).[](input=RelSubset#15240,convention=JDBC.pgsql,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
  Set#17, type: RecordType(INTEGER col1, INTEGER col10)
	  rel#15221:RelSubset#17.LOGICAL.ANY([]).[], best=rel#15220
		  rel#15220:DrillJoinRel.LOGICAL.ANY([]).[](left=RelSubset#15230,right=RelSubset#15238,condition=true,joinType=inner), rowcount=2.25E16, cumulative cost={6.300002E8 rows, 2.030000202E9 cpu, 0.0 io, 0.0 network, 1.32E9 memory}
		  rel#15263:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15245), rowcount=2.25E16, cumulative cost={inf}
		  rel#15270:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15250), rowcount=2.25E16, cumulative cost={inf}
	  rel#15245:RelSubset#17.JDBC.my.ANY([]).[], best=null
		  rel#15244:JdbcJoin.JDBC.my.ANY([]).[](left=RelSubset#15241,right=RelSubset#15217,condition=true,joinType=inner), rowcount=1.5E8, cumulative cost={inf}
	  rel#15250:RelSubset#17.JDBC.pgsql.ANY([]).[], best=null
		  rel#15249:JdbcJoin.JDBC.pgsql.ANY([]).[](left=RelSubset#15212,right=RelSubset#15247,condition=true,joinType=inner), rowcount=1.5E8, cumulative cost={inf}
	  rel#15251:RelSubset#17.PHYSICAL.SINGLETON([]).[], best=null
		  rel#15273:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15245), rowcount=2.25E16, cumulative cost={inf}
		  rel#15276:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15250), rowcount=2.25E16, cumulative cost={inf}
  Set#18, type: RecordType(INTEGER col1, INTEGER col10)
	  rel#15223:RelSubset#18.LOGICAL.ANY([]).[], best=rel#15222
		  rel#15222:DrillScreenRel.LOGICAL.ANY([]).[](input=RelSubset#15221), rowcount=2.25E16, cumulative cost={2.2500006300002E15 rows, 2.250002030000202E15 cpu, 0.0 io, 0.0 network, 1.32E9 memory}
	  rel#15224:RelSubset#18.PHYSICAL.SINGLETON([]).[], best=null
		  rel#15252:ScreenPrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15251), rowcount=2.25E16, cumulative cost={inf}
  1. When i modify query use inner/right/full join, the query run normally
  apache drill> select * from pgsql.t1 as pg inner join my.wubq.t1 as my on pg.col1 = my.col1 where pg.col1 = 1 and my.col1 =1 ;
  +------+-------+
  | col1 | col10 |
  +------+-------+
  | 1    | 1     |
  +------+-------+
  1 row selected (0.644 seconds)
  apache drill> 
@cgivre
Copy link
Contributor

cgivre commented Nov 18, 2022

Is there some reason you can't write the query using a standard join type?

@Javelin2007
Copy link
Author

I just test the case, i also used standard join, but i think use where clause should be wrong.

@cgivre cgivre closed this as completed Feb 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants