Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

custom partition: partition elimination issue? 1.1.54380. #2342

Closed
den-crane opened this issue May 11, 2018 · 3 comments
Closed

custom partition: partition elimination issue? 1.1.54380. #2342

den-crane opened this issue May 11, 2018 · 3 comments

Comments

@den-crane
Copy link
Contributor

den-crane commented May 11, 2018

CH reads all partitions in some cases.

issue:

CREATE TABLE test.testy(d Date, n Int64, k Int64, arr Array(String))
ENGINE = MergeTree Partition by (n,toStartOfMonth(d)) Order by (k);

insert into test.testy select toDate('2018-05-01') as d, 33, 0, []  from numbers(16);
insert into test.testy select toDate('2018-05-02') as d, 33, 0, []  from numbers(933);
insert into test.testy select toDate('2018-05-03') as d, 33, 0, []  from numbers(20000000);

SELECT count() FROM test.testy PREWHERE (n = -1) AND (d >= '2018-05-01');

0 rows in set. Elapsed: 0.014 sec. Processed 14.68 million rows, 146.81 MB (1.05 billion rows/s., 10.55 GB/s.)

Key condition: unknown, unknown, and
MinMax index condition: (column 1 in [-1, -1]), (column 0 in [17652, +inf)), and
Selected 1 parts by date, 1 parts by key, 2433 marks to read from 1 ranges
Reading approx. 19931136 rows


If I change the order of inserts, then no issue?

drop table test.testy;

CREATE TABLE test.testy(d Date, n Int64, k Int64, arr Array(String)) ENGINE = MergeTree Partition by (n,toStartOfMonth(d)) Order by (k);

insert into test.testy select toDate('2018-05-03') as d, 33, 0, []  from numbers(20000000);
insert into test.testy select toDate('2018-05-01') as d, 33, 0, []  from numbers(16);
insert into test.testy select toDate('2018-05-02') as d, 33, 0, []  from numbers(933);

SELECT count() FROM test.testy PREWHERE (n = -1) AND (d >= '2018-05-01');
0 rows in set. Elapsed: 0.002 sec.

Key condition: unknown, unknown, and
MinMax index condition: (column 1 in [-1, -1]), (column 0 in [17652, +inf)), and
Selected 0 parts by date, 0 parts by key, 0 marks to read from 0 ranges

@den-crane
Copy link
Contributor Author

den-crane commented May 11, 2018

drop table test.testy;

CREATE TABLE test.testy(d Int64, n Int64) ENGINE = MergeTree Partition by (ceil(d/2), n) Order by tuple();

insert into test.testy select 1, 33 n from numbers(1);
insert into test.testy select 2, 33 n from numbers(10000000);
insert into test.testy select 1, 34 n from numbers(1);

SELECT count() FROM test.testy PREWHERE (n = 34);
1 rows in set. Elapsed: 0.007 sec. Processed 10.00 million rows, 80.00 MB (1.46 billion rows/s., 11.64 GB/s.)

SELECT count() FROM test.testy PREWHERE (n = 34 and d>=1);
1 rows in set. Elapsed: 0.014 sec. Processed 10.00 million rows, 160.00 MB (725.83 million rows/s., 11.61 GB/s.)

SELECT count() FROM test.testy PREWHERE (n = 34 and d=1);
1 rows in set. Elapsed: 0.014 sec. Processed 10.00 million rows, 160.00 MB (708.99 million rows/s., 11.34 GB/s.)

select partition, rows from system.parts where active = 1 and table ='testy'

(1, 34) 1
(1, 33) 10000001

@den-crane
Copy link
Contributor Author

den-crane commented May 11, 2018

Another order in partition by

drop table test.testy;

CREATE TABLE test.testy(d Int64, n Int64) ENGINE = MergeTree Partition by (n,ceil(d/2)) Order by tuple();

insert into test.testy select 1, 33 n from numbers(1);
insert into test.testy select 2, 33 n from numbers(10000000);
insert into test.testy select 1, 34 n from numbers(1);

SELECT count() FROM test.testy PREWHERE (n = 34);
1 rows in set. Elapsed: 0.007 sec. Processed 10.00 million rows, 80.00 MB (1.41 billion rows/s., 11.27 GB/s.)

SELECT count() FROM test.testy PREWHERE (n = 34 and d>=1);
1 rows in set. Elapsed: 0.013 sec. Processed 10.00 million rows, 160.00 MB (798.77 million rows/s., 12.78 GB/s.)

SELECT count() FROM test.testy PREWHERE (n = 34 and d=1);
1 rows in set. Elapsed: 0.013 sec. Processed 10.00 million rows, 160.00 MB (796.54 million rows/s., 12.74 GB/s.)

select partition, rows from system.parts where active = 1 and table ='testy'

(34, 1) 1
(33, 1) 10000001

@den-crane
Copy link
Contributor Author

den-crane commented May 11, 2018

Changed the order of inserts:

drop table test.testy;

CREATE TABLE test.testy(d Int64, n Int64) ENGINE = MergeTree Partition by (ceil(d/2),n) Order by tuple();

insert into test.testy select 2, 33 n from numbers(10000000);
insert into test.testy select 1, 33 n from numbers(1);
insert into test.testy select 1, 34 n from numbers(1);

SELECT count() FROM test.testy WHERE (n = 34 );
1 rows in set. Elapsed: 0.002 sec. <<<<--- Bingo!!!!

select partition, rows from system.parts where active = 1 and table ='testy'

(1, 34) 1
(1, 33) 10000000
(1, 33) 1

optimize table test.testy;

select partition, rows from system.parts where active = 1 and table ='testy'

(1, 34) 1
(1, 33) 10000001

SELECT count() FROM test.testy WHERE (n = 34 );
1 rows in set. Elapsed: 0.007 sec. Processed 10.00 million rows, 80.00 MB (1.38 billion rows/s., 11.07 GB/s.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant