
hash_join panics when join keys have different data types #2877

@andygrove

Describe the bug

hash_join can panic in several places in DataFusion 9.0.0 when I run SQL queries:

thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', /home/andy/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-9.0.0/src/physical_plan/hash_join.rs:973:17
thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', /home/andy/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-9.0.0/src/physical_plan/hash_join.rs:979:17
thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', /home/andy/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-9.0.0/src/physical_plan/hash_join.rs:1045:17

To Reproduce
I am experimenting with SQL query fuzzing. Here is one query that failed:

SELECT _c438, _c439, _c440, _c441, _c442, _c443
FROM (
  (
    (SELECT test1.c0 AS _c438, test1.c1 AS _c439, test1.c2 AS _c440, test1.c3 AS _c441, test1.c4 AS _c442, test1.c5 AS _c443
      FROM (test1))
    FULL JOIN
    (SELECT test1.c0 AS _c444, test1.c1 AS _c445, test1.c2 AS _c446, test1.c3 AS _c447, test1.c4 AS _c448, test1.c5 AS _c449
      FROM (test1))
    ON _c438 = _c446)
  RIGHT JOIN
  (SELECT test0.c0 AS _c450, test0.c1 AS _c451, test0.c2 AS _c452, test0.c3 AS _c453, test0.c4 AS _c454, test0.c5 AS _c455
    FROM (test0))
  ON _c441 = _c451);

The query fails with:

ArrowError(ExternalError("Arrow error: External error: Execution error: Join Error: task 5672 panicked"))

The data files are here: https://github.com/andygrove/sqlfuzz/tree/main/testdata

Expected behavior
The query should not panic.
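
As an illustration of the non-panicking behavior, here is a minimal, standard-library-only sketch. It assumes (this is not confirmed by the traces above) that the `Option::unwrap()` panics come from downcasting a type-erased key column to the wrong concrete type. `Int32Column`, `Utf8Column`, `JoinKeyTypeError`, `as_int32`, and `keys_equal` are hypothetical names invented for the example, not DataFusion or Arrow APIs.

```rust
use std::any::Any;
use std::fmt;

/// Hypothetical error returned when a join key column has an unexpected type.
#[derive(Debug, PartialEq)]
struct JoinKeyTypeError {
    expected: &'static str,
}

impl fmt::Display for JoinKeyTypeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "join key column is not of type {}", self.expected)
    }
}

/// Hypothetical stand-ins for two differently typed key columns.
struct Int32Column(Vec<i32>);
#[allow(dead_code)]
struct Utf8Column(Vec<String>);

/// Downcast a type-erased column. A mismatch becomes an `Err`
/// instead of a panic from `unwrap()` on `None`.
fn as_int32(column: &dyn Any) -> Result<&Int32Column, JoinKeyTypeError> {
    column
        .downcast_ref::<Int32Column>()
        .ok_or(JoinKeyTypeError { expected: "Int32" })
}

/// Compare one row of the left and right key columns.
/// The row indices are assumed to be in bounds.
fn keys_equal(
    left: &dyn Any,
    right: &dyn Any,
    left_row: usize,
    right_row: usize,
) -> Result<bool, JoinKeyTypeError> {
    let left = as_int32(left)?;
    let right = as_int32(right)?;
    Ok(left.0[left_row] == right.0[right_row])
}

fn main() {
    let ints = Int32Column(vec![1, 2, 3]);
    let strings = Utf8Column(vec!["1".to_string()]);

    // Matching key types: the comparison succeeds.
    println!("{:?}", keys_equal(&ints, &ints, 0, 0));

    // Mismatched key types: an error is reported instead of a panic.
    match keys_equal(&ints, &strings, 0, 0) {
        Ok(equal) => println!("equal: {equal}"),
        Err(e) => println!("error: {e}"),
    }
}
```

Another option would be to reject or coerce mismatched join key types at planning time, so the mismatch never reaches execution. The sketch covers only the execution-time check.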

Additional context
