Invalid plan for DISTINCT on user-defined datatype, #4529

Closed
hlinnaka opened this issue Feb 9, 2018 · 4 comments

@hlinnaka
Member

commented Feb 9, 2018

ORCA generates an invalid plan for this query on an 'hstore' column:

-- Install 'hstore':
\i hstore.sql

postgres=# CREATE TABLE testhstore (h hstore);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause, and no column type is suitable for a distribution key. Creating a NULL policy entry.
CREATE TABLE
postgres=# insert into testhstore values ('');
INSERT 0 1
postgres=# set optimizer=on;
SET
postgres=# select count(distinct h) from testhstore ;
ERROR:  Type 16837 is not hashable.  (seg0 slice1 127.0.0.1:40000 pid=29901)
@hlinnaka

Member Author

commented Feb 9, 2018

This was exposed by new regression tests that were just added in the 9.0 merge, along with new btree and hash operator classes for hstore. Although the issue is new for hstore, which did not have those operator classes before, in principle this is a pre-existing bug that could happen with any user-defined datatype. I think that for this to happen, the datatype must have b-tree and hash opclasses, and it must not be "GPDB hashable", i.e. it must not be a built-in type.
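
The first of those conditions can be checked from the system catalogs. The following is a minimal sketch, assuming the hstore type is installed in the current database; it only lists the b-tree and hash operator classes for the type and does not tell you whether GPDB treats the type as hashable for a Motion:

```sql
-- List the b-tree and hash operator classes defined for the hstore type.
-- Assumes hstore is already installed (e.g. via \i hstore.sql).
SELECT am.amname, opc.opcname, opc.opcdefault
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
WHERE opc.opcintype = 'hstore'::regtype
  AND am.amname IN ('btree', 'hash');
```

If both access methods show up for a user-defined type, it is a candidate for the same failure. Substituting another type name for hstore would check that type instead.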

@hlinnaka

Member Author

commented Feb 9, 2018

Here's the plan:

postgres=# explain select count(distinct h) from testhstore ;
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..431.00 rows=1 width=8)
   ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..431.00 rows=1 width=5)
         ->  GroupAggregate  (cost=0.00..431.00 rows=1 width=5)
               Group Key: h
               ->  Sort  (cost=0.00..431.00 rows=1 width=5)
                     Sort Key: h
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=5)
                           Hash Key: h
                           ->  GroupAggregate  (cost=0.00..431.00 rows=1 width=5)
                                 Group Key: h
                                 ->  Sort  (cost=0.00..431.00 rows=1 width=5)
                                       Sort Key: h
                                       ->  Table Scan on testhstore  (cost=0.00..431.00 rows=1 width=5)
 Optimizer: PQO version 2.54.2
(14 rows)

ORCA should not create a Redistribute Motion with a non-GPDB-hashable datatype in the Hash Key.
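
To see whether the Hash Key on h is specific to ORCA, the plan can be compared with the one the Postgres planner produces. This is only a sketch, assuming the same session and the testhstore table from the original report; the thread does not show the resulting plan, so no output is given here:

```sql
-- Switch to the Postgres planner for this session and inspect its plan.
SET optimizer = off;
EXPLAIN SELECT count(DISTINCT h) FROM testhstore;

-- Switch back to ORCA afterwards.
SET optimizer = on;
```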

@vraghavan78

Member

commented Feb 14, 2018

Thanks for reporting. Will take a look very soon.

@khannaekta

Contributor

commented Mar 14, 2018

Issue fixed in GPORCA v2.55.10 (commit: greenplum-db/gporca@b31aa17)

@khannaekta khannaekta closed this Mar 14, 2018
