Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clickhouse is not using projection when sorting by primary index #58912

Open
Blackmorse opened this issue Jan 17, 2024 · 4 comments
Open

Clickhouse is not using projection when sorting by primary index #58912

Blackmorse opened this issue Jan 17, 2024 · 4 comments

Comments

@Blackmorse
Copy link

Checked on the latest version 23.12.2.59

Description
Say we have MergeTree table A with fields a and b. Sorting key is a, but there is a projection pr where sorting is by b.
When executing a query filtering by b (b = <some_value>) and ordering by a, projection is not applied

Steps to reproduce

CREATE TABLE test
(
    `a` UInt64,
    `b` UInt64,
    PROJECTION pr
    (
        SELECT *
        ORDER BY b
    )
)
ENGINE = MergeTree
ORDER BY a;

INSERT INTO test SELECT
    number,
    number
FROM numbers(1000000);

Now where executing the query:

SELECT *
FROM test
WHERE b = 300
ORDER BY a ASC

it's doing a full scan: 1 row in set. Elapsed: 0.007 sec. Processed 1.00 million rows, 8.00 MB (144.26 million rows/s., 1.15 GB/s.)

Without ORDER BY a it's fine:

SELECT *
FROM test
WHERE b = 300

Query id: 42f0831d-4cb1-433b-b942-78296d1bbfe3

┌───a─┬───b─┐
│ 300 │ 300 │
└─────┴─────┘

1 row in set. Elapsed: 0.003 sec. Processed 8.19 thousand rows, 131.07 KB (2.83 million rows/s., 45.20 MB/s.)

Even preferred_optimize_projection_name is not resolving the issue:

SELECT *
FROM test
WHERE b = 300
ORDER BY a ASC
SETTINGS preferred_optimize_projection_name = 'pr'

Query id: 50215e6d-fb7d-48fd-b6d6-f7c341e7bd05

┌───a─┬───b─┐
│ 300 │ 300 │
└─────┴─────┘

1 row in set. Elapsed: 0.006 sec. Processed 1.00 million rows, 8.00 MB (168.36 million rows/s., 1.35 GB/s.)
@UnamedRus
Copy link
Contributor

What if you disable optimize_read_in_order?

@Blackmorse
Copy link
Author

@UnamedRus , yes, this settings does the job.

Still is it considered to be a correct behavior? Since optimize_read_in_order is on by default it can affect performance of such queries

@UnamedRus
Copy link
Contributor

Still is it considered to be a correct behavior?

No, ClickHouse should be smart enough to read data from projection if it's not much

@UnamedRus
Copy link
Contributor

Somewhat related #17364

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants