Unwrap casts in comparison expressions #731

Open: wants to merge 4 commits into base: master

3 participants
@martint (Member) commented May 8, 2019

This change allows the engine to infer that, for instance, given x::smallint

cast(x as bigint) = bigint '1'

can be rewritten as

x = smallint '1'

It generalizes to every comparison operator. It can also infer that:

cast(x as bigint) > bigint '10000000'

can only return null or false, because 10000000 is larger than the smallint maximum of 32767.

It's implemented as a generic rule that can run and apply at arbitrary points
during the optimization process. It's a generalization of what DomainTranslator
does in PushPredicateIntoTableScan.

Unlike DomainTranslator, it doesn't require the SATURATED_FLOOR_CAST operator,
and relies on types being able to describe the range of valid values. Thus,
it's a long-term replacement for SATURATED_FLOOR_CAST.
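As a rough illustration of the idea described above (not the code in this PR), the sketch below unwraps a cast from a narrower integer type to a wider one by checking the literal against the source type's range. The names `IntegerType` and `unwrap_cast`, and the `IS NULL AND NULL` / `IS NOT NULL OR NULL` forms used for "null or false" and "null or true", are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegerType:
    """Hypothetical description of an integer type and its range."""
    name: str
    min_value: int
    max_value: int


SMALLINT = IntegerType("SMALLINT", -2**15, 2**15 - 1)


def unwrap_cast(column: str, source: IntegerType, op: str, literal: int) -> str:
    """Rewrite `cast(column as <wider integer type>) <op> literal`
    into a predicate on the uncast column, returned as SQL text."""
    if op not in ("=", "<>", "<", "<=", ">", ">="):
        raise ValueError(f"unsupported operator: {op}")

    above = literal > source.max_value
    below = literal < source.min_value

    if not (above or below):
        # The literal is representable in the source type: compare directly.
        return f"{column} {op} {source.name} '{literal}'"

    # The literal is outside the source range, so the result is the same
    # for every non-null value of the column.
    always_true = {
        "=": False,
        "<>": True,
        "<": above,
        "<=": above,
        ">": below,
        ">=": below,
    }[op]

    if always_true:
        return f"{column} IS NOT NULL OR NULL"   # null or true
    return f"{column} IS NULL AND NULL"          # null or false
```

For example, `unwrap_cast("x", SMALLINT, "=", 1)` gives `x = SMALLINT '1'`, while `unwrap_cast("x", SMALLINT, ">", 10000000)` gives `x IS NULL AND NULL`, which can only evaluate to null or false.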

Given the following tables:

CREATE TABLE t1 (v smallint);
CREATE TABLE t2 (v bigint);

And the following query:

SELECT *
FROM t1 JOIN t2 ON t1.v = t2.v
WHERE t1.v = BIGINT '1';

The plan before this change is:

  - Output[name]
    - InnerJoin[("expr" = "v_0")]
      - ScanFilterProject[table = memory:9, filterPredicate = (CAST("v" AS bigint) = BIGINT '1')]
          expr := CAST("v" AS bigint)
          v := 0
      - TableScan[memory:8]
          v_0 := 0

And after this change:

  - Output[name]
    - CrossJoin
      - ScanFilterProject[table = memory:9, filterPredicate = ("v" = SMALLINT '1')]
          v := 0
      - ScanFilterProject[table = memory:8, filterPredicate = ("v_0" = BIGINT '1')]
          v_0 := 0

@cla-bot added the cla-signed label May 8, 2019

@martint force-pushed the martint:unwrap-cast branch 5 times, most recently from 53ee886 to dbc3a4f May 8, 2019

@martint (Member Author) commented May 9, 2019

@findepi, I've realized that since the semantics of CAST with respect to truncation or rounding are not well-defined, we can't rely on the roundtrip logic I have in this PR. That's why we have the "floor cast" operation, which is a CAST with guaranteed truncation semantics (plus saturation if the value is outside the range). I'll adjust the logic to use "saturated floor cast" instead.
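To make the concern concrete, here is a small sketch (an illustration only, not the PR's logic) of rewriting `cast(x as double) > 1.5` for an integer `x`. It assumes two stand-ins for the narrowing cast: `math.floor` as the cast with guaranteed floor semantics and Python's built-in `round` as a cast that may round up. The helper `rewrite_greater_than` is hypothetical.

```python
import math

# All smallint values, used to check a rewrite exhaustively.
SMALLINT_VALUES = range(-2**15, 2**15)


def original(x: int) -> bool:
    """The predicate before rewriting: cast(x as double) > 1.5."""
    return float(x) > 1.5


def rewrite_greater_than(narrowing_cast, literal: float):
    """Build `x > narrowing_cast(literal)` as a Python predicate."""
    bound = narrowing_cast(literal)
    return lambda x: x > bound


# With floor semantics the bound is 1, and x > 1 matches x > 1.5 for integers.
floor_based = rewrite_greater_than(math.floor, 1.5)

# With a cast that rounds, the bound becomes 2, and x > 2 wrongly drops x = 2.
round_based = rewrite_greater_than(round, 1.5)

floor_is_equivalent = all(floor_based(x) == original(x) for x in SMALLINT_VALUES)
round_is_equivalent = all(round_based(x) == original(x) for x in SMALLINT_VALUES)
```

Here `floor_is_equivalent` is true and `round_is_equivalent` is false, which is why the rewrite needs a cast whose rounding direction is guaranteed.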

@martint (Member Author) commented May 10, 2019

If we had both a floor_cast and a ceiling_cast, or a next_value/previous_value, we wouldn't need types to describe their range. Unfortunately, with just floor_cast, it's impossible to distinguish between a value saturating at the upper end vs the literal not having a representation in the source type.
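One way to picture the ambiguity is to look at what a round trip through a saturated floor cast can reveal. The sketch below is an assumption-laden illustration: `saturated_floor_cast` and `round_trip_signal` are hypothetical helpers for a double-to-smallint cast, and only finite literals are handled.

```python
import math

SMALLINT_MIN, SMALLINT_MAX = -2**15, 2**15 - 1


def saturated_floor_cast(value: float) -> int:
    """Floor a finite double and clamp the result to the smallint range."""
    return max(SMALLINT_MIN, min(SMALLINT_MAX, math.floor(value)))


def round_trip_signal(literal: float) -> str:
    """Compare the literal with its round trip through smallint."""
    back = float(saturated_floor_cast(literal))
    if back == literal:
        return "equal"
    return "less" if back < literal else "greater"


def saturated_high(literal: float) -> bool:
    """Needs the type's range: did the literal saturate at the upper end?"""
    return literal > SMALLINT_MAX
```

Both `round_trip_signal(1.5)` and `round_trip_signal(40000.0)` return `"less"`, so the round trip alone cannot tell a literal that falls between two smallint values from one that saturated at the upper end. At the lower end, `round_trip_signal(-40000.0)` returns `"greater"`, which a floor can only produce through saturation. Knowing the range (as in `saturated_high`) separates the two upper-end cases.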

@martint added the WIP label May 10, 2019

@martint force-pushed the martint:unwrap-cast branch 6 times, most recently from f23a965 to 71b723c May 12, 2019

@martint removed the WIP label May 13, 2019

@martint requested a review from findepi May 13, 2019

@martint (Member Author) commented May 13, 2019

@findepi, I applied review comments, fixed some outstanding issues and added a bunch of tests. Please take another look.

@martint force-pushed the martint:unwrap-cast branch from 71b723c to e3c2e99 May 13, 2019

@martint force-pushed the martint:unwrap-cast branch 2 times, most recently from 1105ac6 to 0b8b04c May 14, 2019

@martint force-pushed the martint:unwrap-cast branch 2 times, most recently from 3326423 to 8221a97 May 17, 2019

martint added some commits May 8, 2019

Remove duplicate predicates
Remove duplicate predicates in logical binary expressions (AND, OR).
Canonicalize commutative arithmetic expressions and comparisons to
handle a larger number of variants.

Unwrap casts in comparison expressions

martint added some commits May 17, 2019

Fix rendering of float values in cast error message
It was showing the encoded value (long) instead of the value
as a floating point number.

@martint force-pushed the martint:unwrap-cast branch from 8221a97 to 735f7ae May 20, 2019
