Skip to content

Source part metadata version X is newer then the table metadata Y after table restore #67457

@MatthewRBruce

Description

@MatthewRBruce

Describe the unexpected behaviour
It appears that the built in ClickHouse Backup and Restore doesn't write (or restore) a table's metadata version correctly - it gets set back to 0 after a restore:

How to reproduce
Clickhouse Version 24.5.1.1763
First create a table:

CREATE TABLE backup_test_db.base__core__users_sharded
(
    `user_id` UInt64,
    `shop_id` UInt64,
    `first_name` Nullable(String),
    `_row_version` DateTime64(3)
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/backup_test_db/{shard}/base__core__users_sharded', '{replica}', _row_version)
PRIMARY KEY (shop_id, user_id)
ORDER BY (shop_id, user_id)
SETTINGS index_granularity = 8192, clean_deleted_rows = 'Never'

Then insert some data into it:

insert into base__core__users_sharded (`user_id`, `shop_id`, `first_name`) VALUES (0, 0, 'Test1')
insert into base__core__users_sharded (`user_id`, `shop_id`, `first_name`) VALUES (1, 0, 'Test2')
insert into base__core__users_sharded (`user_id`, `shop_id`, `first_name`) VALUES (2, 0, 'Test3')

We can see in the output below that the metadata_version of the table is 0:

clickhouse :) select * from system.tables where database='backup_test_db';\G

SELECT *
FROM system.tables
WHERE database = 'backup_test_db'

Query id: 7c6057c8-4dd4-48db-8a41-dd3026f8ffba

Row 1:
──────
database:                      backup_test_db
name:                          base__core__users_sharded
uuid:                          c2eecb57-b7a6-4512-9a6f-04c7b3cb7510
engine:                        ReplicatedReplacingMergeTree
is_temporary:                  0
data_paths:                    ['/var/lib/clickhouse/disks/data/0/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/','/var/lib/clickhouse/disks/gcs_metadata/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/']
metadata_path:                 /var/lib/clickhouse/store/2cc/2cc7b27f-a516-4adb-ae88-da62b0ab8d35/base__core__users_sharded.sql
metadata_modification_time:    2024-07-22 17:26:30
metadata_version:              0
dependencies_database:         []
dependencies_table:            []
create_table_query:            CREATE TABLE backup_test_db.base__core__users_sharded (`user_id` UInt64, `shop_id` UInt64, `first_name` Nullable(String), `_row_version` DateTime64(3)) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/backup_test_db/{shard}/base__core__users_sharded', '{replica}', _row_version) PRIMARY KEY (shop_id, user_id) ORDER BY (shop_id, user_id) SETTINGS index_granularity = 8192, clean_deleted_rows = 'Never'
engine_full:                   ReplicatedReplacingMergeTree('/clickhouse/tables/backup_test_db/{shard}/base__core__users_sharded', '{replica}', _row_version) PRIMARY KEY (shop_id, user_id) ORDER BY (shop_id, user_id) SETTINGS index_granularity = 8192, clean_deleted_rows = 'Never'
as_select:
partition_key:
sorting_key:                   shop_id, user_id
primary_key:                   shop_id, user_id
sampling_key:
storage_policy:                default
total_rows:                    3
total_bytes:                   1440
total_bytes_uncompressed:      628
parts:                         3
active_parts:                  3
total_marks:                   6
lifetime_rows:                 ᴺᵁᴸᴸ
lifetime_bytes:                ᴺᵁᴸᴸ
comment:
has_own_data:                  1
loading_dependencies_database: []
loading_dependencies_table:    []
loading_dependent_database:    []
loading_dependent_table:       []

1 row in set. Elapsed: 0.004 sec.

If we check the metadata_version of one of the parts on disk, it is also 0:

$ cat /var/lib/clickhouse/disks/data/0/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/all_2_2_0/metadata_version.txt
0

Now we alter the table by adding a new column which will increment it's metadata version:

clickhouse :) ALTER TABLE backup_test_db.base__core__users_sharded ADD COLUMN `last_name` Nullable(String) after `first_name`

clickhouse :) select * from system.tables where database='backup_test_db';\G

SELECT *
FROM system.tables
WHERE database = 'backup_test_db'

Query id: ea4d23d2-65b5-4cbe-bc0d-db96a446a4fb

Row 1:
──────
database:                      backup_test_db
name:                          base__core__users_sharded
uuid:                          c2eecb57-b7a6-4512-9a6f-04c7b3cb7510
engine:                        ReplicatedReplacingMergeTree
is_temporary:                  0
data_paths:                    ['/var/lib/clickhouse/disks/data/0/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/','/var/lib/clickhouse/disks/gcs_metadata/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/']
metadata_path:                 /var/lib/clickhouse/store/2cc/2cc7b27f-a516-4adb-ae88-da62b0ab8d35/base__core__users_sharded.sql
metadata_modification_time:    2024-07-22 17:43:42
metadata_version:              1
dependencies_database:         []
dependencies_table:            []
create_table_query:            CREATE TABLE backup_test_db.base__core__users_sharded (`user_id` UInt64, `shop_id` UInt64, `first_name` Nullable(String), `last_name` Nullable(String), `_row_version` DateTime64(3)) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/backup_test_db/{shard}/base__core__users_sharded', '{replica}', _row_version) PRIMARY KEY (shop_id, user_id) ORDER BY (shop_id, user_id) SETTINGS index_granularity = 8192, clean_deleted_rows = 'Never'
engine_full:                   ReplicatedReplacingMergeTree('/clickhouse/tables/backup_test_db/{shard}/base__core__users_sharded', '{replica}', _row_version) PRIMARY KEY (shop_id, user_id) ORDER BY (shop_id, user_id) SETTINGS index_granularity = 8192, clean_deleted_rows = 'Never'
as_select:
partition_key:
sorting_key:                   shop_id, user_id
primary_key:                   shop_id, user_id
sampling_key:
storage_policy:                default
total_rows:                    3
total_bytes:                   1440
total_bytes_uncompressed:      628
parts:                         3
active_parts:                  3
total_marks:                   6
lifetime_rows:                 ᴺᵁᴸᴸ
lifetime_bytes:                ᴺᵁᴸᴸ
comment:
has_own_data:                  1
loading_dependencies_database: []
loading_dependencies_table:    []
loading_dependent_database:    []
loading_dependent_table:       []

1 row in set. Elapsed: 0.003 sec.

We add some new data into the table which should generate parts with metadata version 1:

insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (3, 0, 'Test4', 'Test4')
insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (4, 0, 'Test5', 'Test5')

$ cat /var/lib/clickhouse/disks/data/0/store/c2e/c2eecb57-b7a6-4512-9a6f-04c7b3cb7510/all_4_4_0/metadata_version.txt
1

Now we perform a Backup of the tables on both replicas:
On Replica 0:

BACKUP TABLE backup_test_db.base__core__users_sharded TO Disk('gcs_backups', 'clickhouse-2-0/backup_test_db/full/2024-07-22')

On Replica 1:

BACKUP TABLE backup_test_db.base__core__users_sharded TO Disk('gcs_backups', 'clickhouse-2-1/backup_test_db/full/2024-07-22')

Then drop the tables on both replicas:

DROP TABLE  backup_test_db.base__core__users_sharded  SYNC

and then restore the tables (full restore on replica 0, structure only on replica 1)
On Replica 0:

RESTORE TABLE backup_test_db.base__core__users_sharded FROM Disk('gcs_backups', 'clickhouse-2-0/backup_test_db/full/2024-07-22')

On Replica 1:

RESTORE TABLE backup_test_db.base__core__users_sharded FROM Disk('gcs_backups', 'clickhouse-2-1/backup_test_db/full/2024-07-22') SETTINGS structure_only=true

Now we insert some data which should trigger a merge after a while:

insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (5, 0, 'Test6', 'Test6')
insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (6, 0, 'Test7', 'Test7')
insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (7, 0, 'Test8', 'Test8')
insert into backup_test_db.base__core__users_sharded (`user_id`, `shop_id`, `first_name`, `last_name`) VALUES (8, 0, 'Test9', 'Test9'

And after a while we see this error in the debug logs regarding Merges:

2024.07.22 18:38:47.081610 [ 696 ] {} <Debug> 6384a81f-f127-48e5-8b84-70a5b6780cb8::all_0_4_1 (MergeFromLogEntryTask): Source part metadata version 1 is newer then the table metadata version 0. ALTER_METADATA is still in progress.
2024.07.22 18:38:47.084483 [ 697 ] {} <Debug> 6384a81f-f127-48e5-8b84-70a5b6780cb8::all_0_4_1 (MergeFromLogEntryTask): Source part metadata version 1 is newer then the table metadata version 0. ALTER_METADATA is still in progress.
2024.07.22 18:38:47.087278 [ 679 ] {} <Debug> 6384a81f-f127-48e5-8b84-70a5b6780cb8::all_0_4_1 (MergeFromLogEntryTask): Source part metadata version 1 is newer then the table metadata version 0. ALTER_METADATA is still in progress.
2024.07.22 18:38:47.090295 [ 694 ] {} <Debug> 6384a81f-f127-48e5-8b84-70a5b6780cb8::all_0_4_1 (MergeFromLogEntryTask): Source part metadata version 1 is newer then the table metadata version 0. ALTER_METADATA is still in progress.
2024.07.22 18:38:47.093332 [ 709 ] {} <Debug> 6384a81f-f127-48e5-8b84-70a5b6780cb8::all_0_4_1 (MergeFromLogEntryTask): Source part metadata version 1 is newer then the table metadata version 0. ALTER_METADATA is still in progress.

Expected behavior
A Backup and Restore should set the metadata version in Zookeeper/Keeper correctly.

Additional context
Note that we can work around this problem by manually setting the metadata versions for each table and replica in Zookeeper/Keeper, but it takes a while to sort out the correct metadata versions for all the tables and set them:

SET /clickhouse/tables/<database>/<replica>/<table>/metadata_version "1"

Metadata

Metadata

Assignees

No one assigned

    Labels

    unexpected behaviourResult is unexpected, but not entirely wrong at the same time.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions