planner,expression: use constraint propagation in partition pruning #8885


### What problem does this PR solve?

Fix #7516

Partition pruning now works for some partition expressions that use the `to_days` function:

``````
CREATE TABLE t1 (
    a int(10) unsigned NOT NULL,
    b DATETIME NOT NULL,
    PRIMARY KEY (a, b)
) PARTITION BY RANGE (TO_DAYS(b))
(PARTITION p20090401 VALUES LESS THAN (TO_DAYS('2009-04-02')),
 PARTITION p20090402 VALUES LESS THAN (TO_DAYS('2009-04-03')),
 PARTITION p20090403 VALUES LESS THAN (TO_DAYS('2009-04-04')),
 PARTITION p20090404 VALUES LESS THAN (TO_DAYS('2009-04-05')),
 PARTITION p20090405 VALUES LESS THAN MAXVALUE);
``````

Each partition's expression has the form `TO_DAYS(b) < XXX and TO_DAYS(b) >= YYY`, so it contains the function `to_days`.

``````
EXPLAIN SELECT * FROM t1 WHERE b <= CAST('2009-04-03' AS DATE);
Union_9	9970.00	root
├─Selection_11	3323.33	cop	le(test.t1.b, 2009-04-03)
│ └─TableScan_10	10000.00	cop	table:t1, partition:p20090401, range:[-inf,+inf], keep order:false, stats:pseudo
├─Selection_14	3323.33	cop	le(test.t1.b, 2009-04-03)
│ └─TableScan_13	10000.00	cop	table:t1, partition:p20090402, range:[-inf,+inf], keep order:false, stats:pseudo
└─Selection_17	3323.33	cop	le(test.t1.b, 2009-04-03)
  └─TableScan_16	10000.00	cop	table:t1, partition:p20090403, range:[-inf,+inf], keep order:false, stats:pseudo
``````

From the result we can see that `p20090404` and `p20090405` are pruned.
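To see why exactly those two partitions drop out, the result can be recomputed by hand. The Go sketch below is not TiDB's pruning code: `toDays`, `partition`, `t1Partitions`, and `survivors` are hypothetical names, and it assumes `TO_DAYS` is monotonic in `b`, so that `b <= '2009-04-03'` implies `TO_DAYS(b) <= TO_DAYS('2009-04-03')`.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// toDays mimics MySQL's TO_DAYS for a date string: the number of days
// since year 0. It relies on TO_DAYS('1970-01-01') being 719528.
func toDays(date string) int64 {
	t, err := time.Parse("2006-01-02", date)
	if err != nil {
		panic(err)
	}
	return t.Unix()/86400 + 719528
}

// partition is a hypothetical stand-in for one RANGE partition.
type partition struct {
	name  string
	upper int64 // exclusive upper bound; math.MaxInt64 plays the role of MAXVALUE
}

// t1Partitions mirrors the partition list of table t1 above.
func t1Partitions() []partition {
	return []partition{
		{"p20090401", toDays("2009-04-02")},
		{"p20090402", toDays("2009-04-03")},
		{"p20090403", toDays("2009-04-04")},
		{"p20090404", toDays("2009-04-05")},
		{"p20090405", math.MaxInt64},
	}
}

// survivors returns the partitions whose interval [lower, upper) can still
// contain a row with TO_DAYS(b) <= maxDays. A partition is pruned as soon
// as its lower bound exceeds maxDays.
func survivors(parts []partition, maxDays int64) []string {
	var kept []string
	lower := int64(math.MinInt64) // the first partition has no lower bound
	for _, p := range parts {
		if lower <= maxDays {
			kept = append(kept, p.name)
		}
		lower = p.upper // the next partition starts where this one ends
	}
	return kept
}

func main() {
	// WHERE b <= CAST('2009-04-03' AS DATE)
	fmt.Println(survivors(t1Partitions(), toDays("2009-04-03")))
}
```

`p20090403` survives because its lower bound equals `TO_DAYS('2009-04-03')`, which the `<=` filter still admits.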

### What is changed and how it works?

This PR removes the two restrictions mentioned in the issue:

1. Pruning didn't support filter conditions that can't be pushed down to the data source.
2. Pruning didn't support expressions for which a range can't be calculated.

Pruning now considers both the pushed-down conditions and the filter conditions in the `selection`.
It leverages the constraint propagation rule from #8640
to handle expressions (functions) for which a range can't be calculated.
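The idea can be sketched as an unsatisfiability check: a partition is pruned when its own constraint, combined with all filter conditions, can never be true. The Go sketch below is illustrative only. The `atom` type, `propagate`, `unsatisfiable`, and `canPrune` are hypothetical names, and the single propagation rule shown (a bound on `x` implies a bound on `f(x)` for a non-decreasing `f`) is an assumption, not a reproduction of the rules in #8640.

```go
package main

import (
	"fmt"
	"math"
)

// atom is a toy constraint "term op val", where the term is either the
// column x itself (onFunc == false) or f(x) (onFunc == true).
type atom struct {
	onFunc bool
	op     string // "<", "<=", ">", ">="
	val    int64
}

// propagate derives bounds on f(x) from bounds on x, assuming f is
// non-decreasing. Strict bounds are weakened because f need not be strict.
func propagate(atoms []atom, f func(int64) int64) []atom {
	out := append([]atom{}, atoms...)
	for _, a := range atoms {
		if a.onFunc {
			continue
		}
		switch a.op {
		case "<", "<=":
			out = append(out, atom{onFunc: true, op: "<=", val: f(a.val)})
		case ">", ">=":
			out = append(out, atom{onFunc: true, op: ">=", val: f(a.val)})
		}
	}
	return out
}

// unsatisfiable reports whether the bounds on f(x) contradict each other.
// It only inspects atoms on f(x), so it may miss contradictions, which is
// safe: a missed contradiction just means the partition is not pruned.
func unsatisfiable(atoms []atom) bool {
	lo, hi := int64(math.MinInt64), int64(math.MaxInt64) // inclusive bounds
	for _, a := range atoms {
		if !a.onFunc {
			continue
		}
		switch a.op {
		case "<":
			hi = min64(hi, a.val-1)
		case "<=":
			hi = min64(hi, a.val)
		case ">":
			lo = max64(lo, a.val+1)
		case ">=":
			lo = max64(lo, a.val)
		}
	}
	return lo > hi
}

func min64(a, b int64) int64 {
	if a < b {
		return a
	}
	return b
}

func max64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

// canPrune combines the partition constraint with both kinds of filter
// conditions (pushed down and kept in the selection) before solving.
func canPrune(part, pushedDown, selection []atom, f func(int64) int64) bool {
	all := append([]atom{}, part...)
	all = append(all, pushedDown...)
	all = append(all, selection...)
	return unsatisfiable(propagate(all, f))
}

func main() {
	day := func(hour int64) int64 { return hour / 24 } // toy monotonic function
	filter := []atom{{onFunc: false, op: "<=", val: 48}}
	partA := []atom{{onFunc: true, op: ">=", val: 2}, {onFunc: true, op: "<", val: 3}}
	partB := []atom{{onFunc: true, op: ">=", val: 3}, {onFunc: true, op: "<", val: 4}}
	fmt.Println(canPrune(partA, nil, filter, day)) // false: day 2 is still reachable
	fmt.Println(canPrune(partB, nil, filter, day)) // true: day 3 contradicts x <= 48
}
```

The checker here is deliberately conservative: when it can't prove a contradiction it keeps the partition, so an incomplete rule set costs performance, not correctness.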

Two minor problems remain, but neither is introduced by this commit:

1. The partition expression for the first partition is `col < const or col is null`, and the current partition pruning can't handle the `or col is null` part.
2. The range calculation is still kept in the code; we should get rid of it eventually.

### Check List

Tests

• Unit test
• Integration test


### tiancaiamao commented Jan 2, 2019

 This is still WIP, and the changes are based on #8640 @zz-jason

 LGTM

`if partCol == nil {`

#### zz-jason Jan 8, 2019

Member

can we check `partCol == nil` first when entering this function?

#### tiancaiamao Jan 8, 2019

Author Contributor

Do you mean checking `partCol == nil` at the beginning of this function and returning false? No, we can't do that.

In `to_days(c) > xx and c < yy`, `partCol` is nil, but I want to handle it.
If this branch were moved to the beginning, `canBePruned` would return immediately.

#### zz-jason Jan 8, 2019

Member

Got it. Could you add a comment about this to let others know why we intentionally put this check after `solver.Solve()`?
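The ordering discussed in this thread can be sketched as follows. This is a simplified illustration, not the code in the PR: the `column` type, the injected `solve` and `pruneByRange` functions, and the signature of `canBePruned` are all hypothetical stand-ins, and the assumption that the nil check guards a range-based fallback is inferred from the discussion rather than confirmed by it.

```go
package main

import "fmt"

// column is a hypothetical stand-in for the planner's column type.
type column struct{ name string }

// canBePruned sketches the control flow: the solver runs first, and the
// nil check on partCol comes only afterwards.
func canBePruned(partCol *column, solve func() bool, pruneByRange func(*column) bool) bool {
	// The solver can prove a contradiction even when the partition
	// expression is a function such as to_days(c), where partCol is nil.
	if solve() {
		return true
	}
	// Intentionally placed after the solver: checking partCol == nil at
	// the top would return early and skip cases like
	// `to_days(c) > xx and c < yy`.
	if partCol == nil {
		return false
	}
	// Assumed fallback that needs a plain partition column.
	return pruneByRange(partCol)
}

func main() {
	solved := func() bool { return true }
	noRange := func(*column) bool { return false }
	// partCol is nil, yet the partition is pruned because the solver ran first.
	fmt.Println(canBePruned(nil, solved, noRange))
}
```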

 LGTM

### tiancaiamao commented Jan 15, 2019

/run-all-tests

### tiancaiamao commented Jan 15, 2019

/run-unit-test

### tiancaiamao commented Jan 15, 2019

/run-all-tests

### tiancaiamao commented Jan 16, 2019

PTAL @winoros

### tiancaiamao commented Jan 17, 2019

/run-all-tests


