-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Open
Labels
Description
Company or project name
Observability --> logging use-case.
Describe the situation
equality match for fields should use primary key index even if primary key index entry has been derived via deterministic function.
That is:
CREATE TABLE test (
field_1 type_1
...
)
ENGINE = MergeTree()
ORDER BY ..., f(field_1, ...), ...
For queries like:
WHERE field_1 = <>
it should use primary key index, under assumption f is deterministic function.
Right now this is needed:
WHERE field_1 = <> AND f(field_1, ...)
to use primary key index
Which ClickHouse versions are affected?
25.4.3.22
How to reproduce
DROP TABLE IF EXISTS test;
CREATE TABLE test (
pod String
)
ENGINE = MergeTree()
ORDER BY left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1);
INSERT INTO test
SELECT
arrayElement(
['vector-abc-001', 'vector-abc-002', 'metrics-def-003', 'metrics-def-004', 'logs-ghi-005', 'logs-ghi-006', 'traces-jkl-007', 'traces-jkl-008', 'events-mno-009', 'events-mno-010'],
(number % 10) + 1
)
FROM numbers(100000);
SELECT
pod,
left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1)
FROM test
GROUP BY 1
LIMIT 10;
-- primary key index is NOT used
EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod = 'vector-abc-001';
-- primary key not used
EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod BETWEEN 'vector-abc-001' AND 'vector-abd';
-- primary key index finally used
EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod = 'vector-abc-001'
AND left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1) = 'vector-abc';Expected performance
Primary key used for equality/range optimisations where primary key has derived field in index
Additional context
No response
Reactions are currently unavailable