Use case
Currently ClickHouse does not use materialized columns to optimise queries automatically.
Projections already allow an expression in a query to be replaced with the precomputed column stored in the projection.
Doing the same for materialized columns makes even more sense when the materialized column is part of the primary key (PK).
If these features and optimisations are implemented, MATERIALIZED columns could be added to speed up user queries without changing the queries themselves. Moreover, when there are several materialized columns and projections, ClickHouse can choose the best variant.
Example
CREATE TABLE users_m (uid Int16, name String, age Int16, name_len UInt8 MATERIALIZED length(name)) ENGINE = MergeTree ORDER BY (name_len, name);
CREATE TABLE users_p (uid Int16, name String, age Int16,
PROJECTION p1
(
SELECT
uid, name, age, length(name)
ORDER BY length(name), name
)
) ENGINE=MergeTree ORDER BY (name);
INSERT INTO users_m VALUES (1231, 'John', 33);
INSERT INTO users_m VALUES (6666, 'Ksenia', 48);
INSERT INTO users_m VALUES (8888, 'Alice', 50);
INSERT INTO users_p VALUES (1231, 'John', 33);
INSERT INTO users_p VALUES (6666, 'Ksenia', 48);
INSERT INTO users_p VALUES (8888, 'Alice', 50);
SELECT 'Here we do full scan';
EXPLAIN indexes = 1 SELECT name FROM users_m WHERE length(name) < 5;
SELECT 'Here we use projection and PK';
EXPLAIN indexes = 1 SELECT name FROM users_p WHERE length(name) < 5;
Here we do full scan
Expression ((Projection + Before ORDER BY))
  Filter (WHERE)
    ReadFromMergeTree (default.users_m)
    Indexes:
      PrimaryKey
        Condition: true
        Parts: 3/3
        Granules: 3/3
Here we use projection and PK
Expression ((Projection + Before ORDER BY))
  Filter
    ReadFromMergeTree (p1)
    Indexes:
      PrimaryKey
        Keys:
          length(name)
        Condition: (length(name) in (-Inf, 4])
        Parts: 1/3
        Granules: 1/3
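For comparison, the effect this request aims for can be obtained today by rewriting the filter by hand so that it uses the materialized column. This is a minimal sketch, assuming the users_m table and rows from the example; the rewritten predicate is an illustration and was not part of the original example.

```sql
-- Manual rewrite: filter on the materialized column instead of length(name).
-- name_len is the first key column of users_m, so index analysis can use the PK.
EXPLAIN indexes = 1
SELECT name FROM users_m WHERE name_len < 5;
```

The requested optimisation would apply this substitution automatically, so the original query with length(name) would not need to change.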
Describe the solution you'd like
ClickHouse guarantees that a MATERIALIZED column always corresponds to its source expression:
- A MATERIALIZED column is recomputed during ALTER UPDATE if a column used in its expression is changed.
- If the MATERIALIZED expression is changed by an ALTER TABLE query, the column is invalidated and recomputed.
- Manual INSERT or ALTER UPDATE of a materialized column is not allowed.
ClickHouse can then replace a matching expression in a query with the precomputed materialized column, as it already does for projections:
- The optimisation is applied before index analysis so that the PK can be used.
Describe alternatives you've considered
- A separate type for such materialized columns, with more constraints on the expression (for example, it cannot use EPHEMERAL columns).
- A simplified PROJECTION case, where only new columns are added without a different ORDER BY: store these precomputed columns as ordinary columns in the same table.
- ASSUME CONSTRAINT / hypothesis: ClickHouse could automatically add a hypothesis for existing MATERIALIZED columns. This one actually works already; we just do not add the CONSTRAINT for the MATERIALIZED expression automatically:
Expression ((Projection + Before ORDER BY))
  Filter (WHERE)
    ReadFromMergeTree (default.users_a)
    Indexes:
      PrimaryKey
        Keys:
          name_len
        Condition: and((name_len in (-Inf, 4]), (name_len in (-Inf, 4]))
        Parts: 1/3
        Granules: 1/3
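A plan like the one above can be reproduced with a table that declares the hypothesis explicitly. The following is a sketch reconstructed under assumptions, not a copy of the original fiddle: the constraint name c1 is hypothetical, and the three settings shown are an assumption about which ones the thread means when it says three settings must be enabled.

```sql
-- Same layout as users_m, plus an explicit ASSUME constraint that ties
-- the materialized column to its expression (constraint name is hypothetical).
CREATE TABLE users_a
(
    uid Int16,
    name String,
    age Int16,
    name_len UInt8 MATERIALIZED length(name),
    CONSTRAINT c1 ASSUME length(name) = name_len
)
ENGINE = MergeTree
ORDER BY (name_len, name);

INSERT INTO users_a VALUES (1231, 'John', 33);
INSERT INTO users_a VALUES (6666, 'Ksenia', 48);
INSERT INTO users_a VALUES (8888, 'Alice', 50);

-- Assumed settings that let the optimizer substitute length(name) with name_len.
EXPLAIN indexes = 1
SELECT name FROM users_a WHERE length(name) < 5
SETTINGS convert_query_to_cnf = 1,
         optimize_using_constraints = 1,
         optimize_substitute_columns = 1;
```

With the constraint written by hand, the query text stays the same as in the users_m example; the proposal is for ClickHouse to derive such a hypothesis from the MATERIALIZED definition itself.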
Additional context
I see the following problems to solve:
- ALTER UPDATE cannot trigger recomputation of a materialized column if that column is used in the PK or partition key. This is the same restriction as the existing one: "Updating columns that are used in the calculation of the primary or the partition key is not supported."
- Currently a MATERIALIZED column can use an EPHEMERAL column in its expression. This scenario is valid, but it should keep the current behaviour: the value is computed on insertion and is not updated even if other columns in the expression are updated.
We also have a similar mechanism - optimization with CONSTRAINTs.
It is mentioned in the alternatives. It works, but it requires three settings to be enabled, and the feature is not even documented, so it may not be production-ready. Still, a "materializes_as_constraint" feature could be a good small first step.
Fiddle for the example: https://fiddle.clickhouse.com/69dfb8f5-5713-4959-8f95-30f3381df81e
Fiddle for the ASSUME CONSTRAINT alternative: https://fiddle.clickhouse.com/30a2e2a1-ed60-4686-b11b-2ecd30445eba