Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Right argument of IN is evaluated in a column default expression in 22.10+ #44496

Closed
Algunenano opened this issue Dec 21, 2022 · 4 comments · Fixed by #44547
Closed

Right argument of IN is evaluated in a column default expression in 22.10+ #44496

Algunenano opened this issue Dec 21, 2022 · 4 comments · Fixed by #44547
Assignees
Labels
bug Confirmed user-visible misbehaviour in official release

Comments

@Algunenano
Copy link
Member

Algunenano commented Dec 21, 2022

Describe the issue

When you add a new replica to a table using certain operations (joinGet in the definition in this example), but values of the table might be leaked into the ZK definition leading to issues when creating a table.

How to reproduce

Setup

2 replicas
1 cluster (in this case called tinybird)

In the first replica we execute:

SELECT version() from clusterAllReplicas('tinybird', system.one);
DROP DATABASE IF EXISTS upgrade_problem ON CLUSTER tinybird SYNC;
CREATE DATABASE upgrade_problem ON CLUSTER tinybird;
CREATE TABLE upgrade_problem.id_join ON CLUSTER tinybird
(
    `country` String,
    `location` Array(Int32)
)
ENGINE = Join(ANY, LEFT, country);

The above has created a database upgrade_problem on cluster and a join table, also on cluster.

Now in the first replica we add data:

INSERT INTO upgrade_problem.id_join values ('CLICK', [1234]);  -- Only adds data to server1 by design

And we also create a new table that uses this info (not created on cluster, only in the server1):

CREATE TABLE upgrade_problem.dep
    (
    `id` Int32,
    `country` LowCardinality(String),
    `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
    )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
ORDER BY tuple();

Now we go into server2 and add the replica manually (not via on cluster):

CREATE TABLE upgrade_problem.dep
    (
    `id` Int32,
    `country` LowCardinality(String),
    `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
    )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
ORDER BY tuple();

Results:

  • 22.8.5.29: No problem
  • 22.9.3.18: No problem
  • 22.10.2.11: Fails
  • 22.11.2.30: Fails
  • 22.12.1.1752: Fails

Example error:

production-02 :) CREATE TABLE upgrade_problem.dep
                     (
                     `id` Int32,
                     `country` LowCardinality(String),
                     `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
                     )
                 ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
                 ORDER BY tuple();

CREATE TABLE upgrade_problem.dep
(
    `id` Int32,
    `country` LowCardinality(String),
    `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
ORDER BY tuple()

Query id: 57930d18-0b8c-4f31-9fed-3e91c4d8e918


Received exception from server (version 22.10.2):
Code: 122. DB::Exception: Received from clickhouse-02:59000. DB::Exception: Table columns structure in ZooKeeper is different from local table structure. Local columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN [], 123, 456)

Zookeeper columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN [1234], 123, 456)
. (INCOMPATIBLE_COLUMNS)

For some reason the definition comparison between the 2 servers has replaced joinGet('upgrade_problem.id_join', 'location', 'CLICK') by its result. Since the result is different ([1234] vs []), the comparison fails.

This breaks adding new replicas. Note that this also happens, and it's made 10x worse, even when the data matches but the new replica is newer.

For example, if you create the first replica with 22.9 and then add a new one with 22.12 you will get this error:

Received exception from server (version 22.12.1):
Code: 122. DB::Exception: Received from clickhouse-02:59000. DB::Exception: Table columns structure in ZooKeeper is different from local table structure. Local columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN [], 123, 456)

Zookeeper columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN joinGet(\'upgrade_problem.id_join\', \'location\', \'CLICK\'), 123, 456)
. (INCOMPATIBLE_COLUMNS)

Even if you match the data the definition still changes:

Code: 122. DB::Exception: Received from clickhouse-02:59000. DB::Exception: Table columns structure in ZooKeeper is different from local table structure. Local columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN [1234], 123, 456)

Zookeeper columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN joinGet(\'upgrade_problem.id_join\', \'location\', \'CLICK\'), 123, 456)
. (INCOMPATIBLE_COLUMNS)

Note the diff where it was [] above and [1234] after adding the data.

This to me is a major breaking compatibility issue as you are unable to add new replicas to existing clusters anymore.

@Algunenano
Copy link
Member Author

Note that if the table was originally created in 22.9 or previous, you updated the whole cluster and now try to add a new replica it won't work either:

production-02 :) SELECT version() from clusterAllReplicas('tinybird', system.one);

SELECT version()
FROM clusterAllReplicas('tinybird', system.one)

Query id: dfe9144f-b570-41e5-9465-9d20441bb8e3

┌─version()────┐
│ 22.12.1.1752 │
└──────────────┘
┌─version()────┐
│ 22.12.1.1752 │
└──────────────┘

2 rows in set. Elapsed: 0.004 sec. 

production-02 :) CREATE TABLE upgrade_problem.dep
                     (
                     `id` Int32,
                     `country` LowCardinality(String),
                     `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
                     )
                 ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
                 ORDER BY tuple();

CREATE TABLE upgrade_problem.dep
(
    `id` Int32,
    `country` LowCardinality(String),
    `purchase_location` UInt16 MATERIALIZED if(id IN joinGet('upgrade_problem.id_join', 'location', 'CLICK'), 123, 456)
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/upgrade_problem.dep', '{replica}')
ORDER BY tuple()

Query id: 4c0585fa-a450-4062-acca-6d3bfbe8eabb


0 rows in set. Elapsed: 0.019 sec. 

Received exception from server (version 22.12.1):
Code: 122. DB::Exception: Received from clickhouse-02:59000. DB::Exception: Table columns structure in ZooKeeper is different from local table structure. Local columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN [1234], 123, 456)

Zookeeper columns:
columns format version: 1
3 columns:
`id` Int32
`country` LowCardinality(String)
`purchase_location` UInt16      MATERIALIZED    if(id IN joinGet(\'upgrade_problem.id_join\', \'location\', \'CLICK\'), 123, 456)
. (INCOMPATIBLE_COLUMNS)


@Algunenano
Copy link
Member Author

690ec74bf235bd7c84b88acb8820f8d6efff9ec2 is the first bad commit
commit 690ec74bf235bd7c84b88acb8820f8d6efff9ec2
Author: Alexander Tokmakov <tavplubix@clickhouse.com>
Date:   Wed Oct 5 20:58:27 2022 +0200

    better handling for expressions in dictGet

 src/Databases/DDLDependencyVisitor.cpp             | 73 +++++++++++++++++++---
 src/Databases/DDLDependencyVisitor.h               | 31 ++++++++-
 .../getDictionaryConfigurationFromAST.cpp          |  5 ++
 src/Interpreters/InterpreterCreateQuery.cpp        | 20 ++++++
 ...check_dependencies_and_table_shutdown.reference |  6 ++
 ...02449_check_dependencies_and_table_shutdown.sql | 40 ++++++++++++
 6 files changed, 163 insertions(+), 12 deletions(-)
 create mode 100644 tests/queries/0_stateless/02449_check_dependencies_and_table_shutdown.reference
 create mode 100644 tests/queries/0_stateless/02449_check_dependencies_and_table_shutdown.sql

It seems #42106 might have changed (and broken) replication when the columns use *Get functions.

cc @tavplubix

@tavplubix tavplubix self-assigned this Dec 22, 2022
@tavplubix
Copy link
Member

For some reason the definition comparison between the 2 servers has replaced joinGet('upgrade_problem.id_join', 'location', 'CLICK') by its result.

It was not intended and it's a bug. The idea was to replace expressions like dictGet(currentDatabase() || '.dict', x, y) with dictGet('db_name.dict', x, y), not to evaluate all default expressions. The problem is that the function in is much more tricky and its second argument can be a table name or a tuple or a subquery. But the table name must be an identifier (not a string literal), so actually we don't need to evaluate and replace anything for in. I will fix it.

It's not really a compatibility issue, it's just a bug that leads to broken metadata. It's easy to add a new replica using a new version:

CREATE TABLE upgrade_problem.dep
(
 ...
    `purchase_location` UInt16 MATERIALIZED if(id IN [1234], 123, 456)
) ...

so metadata will match.

@Algunenano
Copy link
Member Author

Understood.

Also note that in the case where the table was created in a previous version (and the metadata matches the table declaration without replaces), the solution is to add the replica with an older release and then upgrade it instead of adding the replica with the new version directly.

@tavplubix tavplubix added bug Confirmed user-visible misbehaviour in official release and removed backward compatibility labels Dec 22, 2022
@tavplubix tavplubix changed the title Broken ZK columns for joinGet in 22.10+ (breaking compatibility) Right argument of IN is evaluated in a column default expression in 22.10+ Dec 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed user-visible misbehaviour in official release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants