Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predicate push down bug #13565

Closed
lzeiming opened this issue Aug 9, 2022 · 5 comments · Fixed by #13758
Closed

Predicate push down bug #13565

lzeiming opened this issue Aug 9, 2022 · 5 comments · Fixed by #13758
Labels
bug Something isn't working

Comments

@lzeiming
Copy link
Contributor

lzeiming commented Aug 9, 2022

I found the issue in versions 364,380, 392


use hive.test;

create table t1
(
    c1 varchar
);
create table t2
(
    c1 varchar
);

create view v1 as
select *
from t1
union all
select *
from t2;

insert into t1(c1) values ('1');

create table t3
(
    a varchar,
    b varchar
);

select a
from (
         select a, b from t3
         union all
         select null as a, null as b
         from (
                  SELECT bb.c1
                  FROM (
                           SELECT c1

                           FROM v1
                       ) aa
                           JOIN (
                      SELECT c1
                      FROM v1
                  ) bb
                                ON aa.c1 = bb.c1
              )
     )
where cast(a as timestamp) >= timestamp '2022-08-01 00:00:00'
  and b IN ('hello') ;

Query 20220809_115635_06050_yjfzd failed: Value cannot be cast to timestamp: hello

I find the unexpected filter in the plan:
(CAST(VARCHAR 'hello' AS timestamp) >= TIMESTAMP '2022-08-01 00:00:00.000'))

@findepi
Copy link
Member

findepi commented Aug 9, 2022

Reproducible on current master also with the memory connector

USE memory.default;

DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
DROP TABLE IF EXISTS t3;
DROP VIEW IF EXISTS v1;

CREATE TABLE t1 (c1 varchar);
CREATE TABLE t2 (c1 varchar);
CREATE TABLE t3 (a varchar, b varchar);

CREATE VIEW v1 AS
 SELECT * FROM t1 UNION ALL SELECT * FROM t2;

INSERT INTO t1 VALUES '1';

SELECT a
FROM (
    SELECT a, b FROM t3
    UNION ALL
    SELECT NULL AS a, NULL AS b FROM (
        SELECT bb.c1
        FROM (SELECT c1 FROM v1) aa
        JOIN (SELECT c1 FROM v1) bb ON aa.c1 = bb.c1
    )
)
WHERE CAST(a AS TIMESTAMP) >= timestamp '2022-08-01 00:00:00'
AND b IN ('hello');  -- query also fails with `= 'hello'` here
USE
DROP TABLE
DROP TABLE
DROP TABLE
DROP VIEW
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE VIEW
INSERT: 1 row

Query 20220809_144340_00075_sftar, FINISHED, 1 node
http://localhost:8080/ui/query.html?20220809_144340_00075_sftar
Splits: 35 total, 35 done (100.00%)
CPU Time: 0.0s total,     0 rows/s,     0B/s, 80% active
Per Node: 0.0 parallelism,     0 rows/s,     0B/s
Parallelism: 0.0
Peak Memory: 201B
0.13 [0 rows, 0B] [0 rows/s, 0B/s]


Query 20220809_144340_00076_sftar, FAILED, 1 node
http://localhost:8080/ui/query.html?20220809_144340_00076_sftar
Splits: 87 total, 20 done (22.99%)
CPU Time: 0.0s total,     0 rows/s,     0B/s, 60% active
Per Node: 0.0 parallelism,     0 rows/s,     0B/s
Parallelism: 0.0
Peak Memory: 0B
0.07 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20220809_144340_00076_sftar failed: Value cannot be cast to timestamp: hello
io.trino.spi.TrinoException: Value cannot be cast to timestamp: hello
	at io.trino.operator.scalar.timestamp.VarcharToTimestampCast.castToShortTimestamp(VarcharToTimestampCast.java:78)
	at io.trino.operator.scalar.timestamp.VarcharToTimestampCast.castToShort(VarcharToTimestampCast.java:52)
	at io.trino.$gen.PageFilter_20220809_143959_45.filter(Unknown Source)
	at io.trino.$gen.PageFilter_20220809_143959_45.filter(Unknown Source)
	at io.trino.operator.project.PageProcessor.createWorkProcessor(PageProcessor.java:120)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.lambda$processPageSource$1(ScanFilterAndProjectOperator.java:295)
	at io.trino.operator.WorkProcessorUtils.lambda$flatMap$5(WorkProcessorUtils.java:264)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:338)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils.lambda$flatten$7(WorkProcessorUtils.java:296)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:338)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:240)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$3(WorkProcessorUtils.java:219)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:240)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$4(WorkProcessorUtils.java:234)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:150)
	at io.trino.operator.Driver.processInternal(Driver.java:410)
	at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
	at io.trino.operator.Driver.tryWithLock(Driver.java:698)
	at io.trino.operator.Driver.process(Driver.java:305)
	at io.trino.operator.Driver.processForDuration(Driver.java:276)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:740)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:164)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:490)
	at io.trino.$gen.Trino_dev____20220809_141716_2.run(Unknown Source)

cc @sopel39 @martint

@findepi findepi added the bug Something isn't working label Aug 9, 2022
@martint
Copy link
Member

martint commented Aug 19, 2022

It's happening because during predicate pushdown, the following predicate is derived:

((CAST(CAST(null AS varchar) AS timestamp) >= TIMESTAMP '2022-08-01 00:00:00.000') AND (CAST(null AS varchar) = VARCHAR 'hello'))

Then, it incorrectly establishes the equivalence CAST(null AS varchar) = VARCHAR 'hello', which it then uses to rewrite the expression by replacing occurrences of CAST(null AS varchar) with VARCHAR 'hello'.

@martint
Copy link
Member

martint commented Aug 19, 2022

Here's a simpler reproduction:

WITH t AS (
    SELECT CAST(null AS varchar) AS x, CAST(null AS varchar) AS y
    FROM (VALUES 1) t(a) JOIN (VALUES 1) u(a) USING (a)
)
SELECT * FROM t
WHERE CAST(x AS BIGINT) IS NOT NULL AND y = 'hello';

@findepi
Copy link
Member

findepi commented Aug 22, 2022

I believe the failure is not guaranteed, hence correctness label.

@findepi
Copy link
Member

findepi commented Aug 22, 2022

@martint supposes it's not a correctness issue. I have no evidence it is, so removing the label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

Successfully merging a pull request may close this issue.

3 participants