Skip to content

Add ambiguous check when generate join plan #4197

@ygf11

Description

@ygf11

Describe the bug
For some join sql whose meaning is not clear, datafusion will not throw error.
For example, test0 and test1 both has c0 column.

❯ explain select * from test0 as t0 inner join test0 as t1 on c0 = c0;
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                                                    |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: t0.c0, t0.c1, t0.c2, t0.c3, t0.c4, t0.c5, t1.c0, t1.c1, t1.c2, t1.c3, t1.c4, t1.c5                                                                          |
|               |   Inner Join: t0.c0 = t1.c0                                                                                                                                             |
|               |     SubqueryAlias: t0                                                                                                                                                   |
|               |       TableScan: test0 projection=[c0, c1, c2, c3, c4, c5]                                                                                                              |
|               |     SubqueryAlias: t1                                                                                                                                                   |
|               |       TableScan: test0 projection=[c0, c1, c2, c3, c4, c5]                                                                                                              |

In PostgreSQL, this will throw an error:

psql -d "$POSTGRES_DB" -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -c "explain select * from test0 as t0 inner join test0 as t1 on c0 = c0;"
ERROR:  column reference "c0" is ambiguous
LINE 1: ...select * from test0 as t0 inner join test0 as t1 on c0 = c0;
                                                               ^

The semantics of this sql is not clear, we need throw exception like PostgreSQL do.

To Reproduce
test0 and test1 both has c0 column.

Then run explain select * from test0 as t0 inner join test0 as t1 on c0 = c0;.

Expected behavior

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions