When use muti data source join query join condition in where clause failed cause by missing conversions #2708

Javelin2007 · 2022-11-18T08:11:55Z

The version information is as follows:

    apache drill> select commit_message, commit_time from sys.version;
    +----------------------------------------------------------------------------------+---------------------------+
    |                                  commit_message                                  |        commit_time        |
    +----------------------------------------------------------------------------------+---------------------------+
    | DRILL-8357: Add new config options to the Splunk storage plugin (extra docs) (#2706) | 15.11.2022 @ 20:34:55 CST |
    +----------------------------------------------------------------------------------+---------------------------+
    1 row selected (0.133 seconds)

Create tables and insert data in postgresql and mysql database

    create table t1(col1 int);
    insert into t1 values(1), (2);

Create mysql and postgresql Plugins in Storage label via http://localhost:8047/storage pages
The sql statement is executed as follows

  apache drill> select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 ;
  Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join. 
  If a cartesian or inequality join is used intentionally, set the option 'planner.enable_nljoin_for_scalar_only' to false and try again.
  
  
  [Error Id: 91583f69-5aa6-44d5-a29c-63e2358932ea ] (state=,code=0)
  apache drill> set planner.enable_nljoin_for_scalar_only=false;
  +------+------------------------------------------------+
  |  ok  |                    summary                     |
  +------+------------------------------------------------+
  | true | planner.enable_nljoin_for_scalar_only updated. |
  +------+------------------------------------------------+
  1 row selected (0.212 seconds)
  apache drill> select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 ;
  Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join. 
  If a cartesian or inequality join is used intentionally, set the option 'planner.enable_nljoin_for_scalar_only' to false and try again.
  
  
  [Error Id: 37d0dbca-40d1-4de5-9443-b48ce3a172c0 ] (state=,code=0)
  apache drill>

The below is drillbit.log info

  2022-11-18 08:08:01,169 [1c88c29e-3040-efc1-46a6-33152cfbab32:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id 1c88c29e-3040-efc1-46a6-33152cfbab32 issued by test: select * from pgsql.t1 as pg ,my.test.t1 as my where pg.col1 = my.col1 and  pg.col1 = 1 and my.col1 =1 
  2022-11-18 08:08:01,258 [1c88c29e-3040-efc1-46a6-33152cfbab32:foreman] ERROR o.a.d.e.p.s.h.DefaultSqlHandler - There are not enough rules to produce a node with desired properties: convention=PHYSICAL, DrillDistributionTraitDef=SINGLETON([]), sort=[].
  Missing conversions are JdbcFilter[convention: JDBC.pgsql -> JDBC.my], JdbcFilter[convention: JDBC.my -> JDBC.pgsql]
  There are 2 empty subsets:
  Empty subset 0: rel#15241:RelSubset#12.JDBC.my.ANY([]).[], the relevant part of the original plan is as follows
  15211:JdbcFilter(condition=[=($0, 1)])
    14914:JdbcTableScan(subset=[rel#15210:RelSubset#11.JDBC.pgsql.ANY([]).[]], table=[[pgsql, t1]])
  
  Empty subset 1: rel#15247:RelSubset#15.JDBC.pgsql.ANY([]).[], the relevant part of the original plan is as follows
  15216:JdbcFilter(condition=[=($0, 1)])
    14915:JdbcTableScan(subset=[rel#15215:RelSubset#14.JDBC.my.ANY([]).[]], table=[[my, test, t1]])
  
  Root: rel#15224:RelSubset#18.PHYSICAL.SINGLETON([]).[]
  Original rel:
  LogicalProject(subset=[rel#14956:RelSubset#4.LOGICAL.ANY([]).[]], col1=[$0], col10=[$1]): rowcount = 3.375E15, cumulative cost = {3.375E15 rows, 6.75E15 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14954
    LogicalFilter(subset=[rel#14953:RelSubset#3.NONE.ANY([]).[]], condition=[AND(=($0, $1), =($0, 1), =($1, 1))]): rowcount = 3.375E15, cumulative cost = {3.375E15 rows, 1.0E18 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14952
      LogicalJoin(subset=[rel#14951:RelSubset#2.NONE.ANY([]).[]], condition=[true], joinType=[inner]): rowcount = 1.0E18, cumulative cost = {1.0E18 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14950
        JdbcTableScan(subset=[rel#14948:RelSubset#0.JDBC.pgsql.ANY([]).[]], table=[[pgsql, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14914
        JdbcTableScan(subset=[rel#14949:RelSubset#1.JDBC.my.ANY([]).[]], table=[[my, test, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14915
  
  Sets:
  Set#11, type: RecordType(INTEGER col1)
	  rel#15210:RelSubset#11.JDBC.pgsql.ANY([]).[], best=rel#14914
		  rel#14914:JdbcTableScan.JDBC.pgsql.ANY([]).[](table=[pgsql, t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15227:RelSubset#11.LOGICAL.ANY([]).[], best=rel#15226
		  rel#15226:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15210), rowcount=1.0E9, cumulative cost={1.000001E8 rows, 1.00000101E8 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15254:RelSubset#11.PHYSICAL.SINGLETON([]).[], best=rel#15253
		  rel#15253:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15210), rowcount=1.0E9, cumulative cost={1.0000001E9 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
  Set#12, type: RecordType(INTEGER col1)
	  rel#15212:RelSubset#12.JDBC.pgsql.ANY([]).[], best=rel#15211
		  rel#15211:JdbcFilter.JDBC.pgsql.ANY([]).[](input=RelSubset#15210,condition==($0, 1)), rowcount=1.5E8, cumulative cost={1.500001E8 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15246:AbstractConverter.JDBC.pgsql.ANY([]).[](input=RelSubset#15232,convention=JDBC.pgsql,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
	  rel#15230:RelSubset#12.LOGICAL.ANY([]).[], best=rel#15213
		  rel#15213:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15212), rowcount=1.5E8, cumulative cost={1.650001E8 rows, 1.015000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15258:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15241), rowcount=1.5E8, cumulative cost={inf}
	  rel#15232:RelSubset#12.PHYSICAL.SINGLETON([]).[], best=rel#15231
		  rel#15231:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15212), rowcount=1.5E8, cumulative cost={3.000001E8 rows, 1.150000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15271:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15241), rowcount=1.5E8, cumulative cost={inf}
	  rel#15241:RelSubset#12.JDBC.my.ANY([]).[], best=null
		  rel#15242:AbstractConverter.JDBC.my.ANY([]).[](input=RelSubset#15232,convention=JDBC.my,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
  Set#14, type: RecordType(INTEGER col1)
	  rel#15215:RelSubset#14.JDBC.my.ANY([]).[], best=rel#14915
		  rel#14915:JdbcTableScan.JDBC.my.ANY([]).[](table=[my, test, t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15235:RelSubset#14.LOGICAL.ANY([]).[], best=rel#15234
		  rel#15234:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15215), rowcount=1.0E9, cumulative cost={1.000001E8 rows, 1.00000101E8 cpu, 0.0 io, 0.0 network, 0.0 memory}
	  rel#15256:RelSubset#14.PHYSICAL.SINGLETON([]).[], best=rel#15255
		  rel#15255:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15215), rowcount=1.0E9, cumulative cost={1.0000001E9 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
  Set#15, type: RecordType(INTEGER col1)
	  rel#15217:RelSubset#15.JDBC.my.ANY([]).[], best=rel#15216
		  rel#15216:JdbcFilter.JDBC.my.ANY([]).[](input=RelSubset#15215,condition==($0, 1)), rowcount=1.5E8, cumulative cost={1.500001E8 rows, 1.000000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15243:AbstractConverter.JDBC.my.ANY([]).[](input=RelSubset#15240,convention=JDBC.my,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
	  rel#15238:RelSubset#15.LOGICAL.ANY([]).[], best=rel#15218
		  rel#15218:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15217), rowcount=1.5E8, cumulative cost={1.650001E8 rows, 1.015000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15268:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15247), rowcount=1.5E8, cumulative cost={inf}
	  rel#15240:RelSubset#15.PHYSICAL.SINGLETON([]).[], best=rel#15239
		  rel#15239:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15217), rowcount=1.5E8, cumulative cost={3.000001E8 rows, 1.150000101E9 cpu, 0.0 io, 0.0 network, 0.0 memory}
		  rel#15274:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15247), rowcount=1.5E8, cumulative cost={inf}
	  rel#15247:RelSubset#15.JDBC.pgsql.ANY([]).[], best=null
		  rel#15248:AbstractConverter.JDBC.pgsql.ANY([]).[](input=RelSubset#15240,convention=JDBC.pgsql,DrillDistributionTraitDef=ANY([]),sort=[]), rowcount=1.5E8, cumulative cost={inf}
  Set#17, type: RecordType(INTEGER col1, INTEGER col10)
	  rel#15221:RelSubset#17.LOGICAL.ANY([]).[], best=rel#15220
		  rel#15220:DrillJoinRel.LOGICAL.ANY([]).[](left=RelSubset#15230,right=RelSubset#15238,condition=true,joinType=inner), rowcount=2.25E16, cumulative cost={6.300002E8 rows, 2.030000202E9 cpu, 0.0 io, 0.0 network, 1.32E9 memory}
		  rel#15263:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15245), rowcount=2.25E16, cumulative cost={inf}
		  rel#15270:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#15250), rowcount=2.25E16, cumulative cost={inf}
	  rel#15245:RelSubset#17.JDBC.my.ANY([]).[], best=null
		  rel#15244:JdbcJoin.JDBC.my.ANY([]).[](left=RelSubset#15241,right=RelSubset#15217,condition=true,joinType=inner), rowcount=1.5E8, cumulative cost={inf}
	  rel#15250:RelSubset#17.JDBC.pgsql.ANY([]).[], best=null
		  rel#15249:JdbcJoin.JDBC.pgsql.ANY([]).[](left=RelSubset#15212,right=RelSubset#15247,condition=true,joinType=inner), rowcount=1.5E8, cumulative cost={inf}
	  rel#15251:RelSubset#17.PHYSICAL.SINGLETON([]).[], best=null
		  rel#15273:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15245), rowcount=2.25E16, cumulative cost={inf}
		  rel#15276:JdbcIntermediatePrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15250), rowcount=2.25E16, cumulative cost={inf}
  Set#18, type: RecordType(INTEGER col1, INTEGER col10)
	  rel#15223:RelSubset#18.LOGICAL.ANY([]).[], best=rel#15222
		  rel#15222:DrillScreenRel.LOGICAL.ANY([]).[](input=RelSubset#15221), rowcount=2.25E16, cumulative cost={2.2500006300002E15 rows, 2.250002030000202E15 cpu, 0.0 io, 0.0 network, 1.32E9 memory}
	  rel#15224:RelSubset#18.PHYSICAL.SINGLETON([]).[], best=null
		  rel#15252:ScreenPrel.PHYSICAL.SINGLETON([]).[](input=RelSubset#15251), rowcount=2.25E16, cumulative cost={inf}

When i modify query use inner/right/full join, the query run normally

  apache drill> select * from pgsql.t1 as pg inner join my.wubq.t1 as my on pg.col1 = my.col1 where pg.col1 = 1 and my.col1 =1 ;
  +------+-------+
  | col1 | col10 |
  +------+-------+
  | 1    | 1     |
  +------+-------+
  1 row selected (0.644 seconds)
  apache drill>

The text was updated successfully, but these errors were encountered:

cgivre · 2022-11-18T13:19:42Z

Is there some reason you can't write the query using a standard join type?

Javelin2007 · 2022-11-21T01:14:47Z

I just test the case, i also used standard join, but i think use where clause should be wrong.

cgivre closed this as completed Feb 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

When use muti data source join query join condition in where clause failed cause by missing conversions #2708

When use muti data source join query join condition in where clause failed cause by missing conversions #2708

Javelin2007 commented Nov 18, 2022 •

edited by vvysotskyi

cgivre commented Nov 18, 2022

Javelin2007 commented Nov 21, 2022

When use muti data source join query join condition in where clause failed cause by missing conversions #2708

When use muti data source join query join condition in where clause failed cause by missing conversions #2708

Comments

Javelin2007 commented Nov 18, 2022 • edited by vvysotskyi

cgivre commented Nov 18, 2022

Javelin2007 commented Nov 21, 2022

Javelin2007 commented Nov 18, 2022 •

edited by vvysotskyi