Describe what's wrong
Our system is sensitive to duplicated data, so we use OPTIMIZE FINAL on ReplacingMergeTree tables to remove duplicates. When running OPTIMIZE FINAL on two large tables (~2 billion rows and ~200 million rows), we get this error on both tables:

Orig exception: Code: 74. DB::ErrnoException: Cannot read from file: /var/lib/clickhouse/store/c48/c487a9b6-3d95-42e8-b168-cd9c45a9758b/all_1_1134741_47/<column_name>.bin, errno: 22, strerror: Invalid argument: Cache info: Buffer path: /var/lib/clickhouse/store/c48/c487a9b6-3d95-42e8-b168-cd9c45a9758b/all_1_1134741_47/<column_name>.bin, hash key: d8713dc8fbd955852c4cac2756e46d72, file_offset_of_buffer_end: 0, internal buffer remaining read range: [0:5554177], read_type: REMOTE_FS_READ_AND_PUT_IN_CACHE, last caller: c487a9b6-3d95-42e8-b168-cd9c45a9758b::all_1_1138616_48:67, file segment info: None: (while reading column <column_name>): (while reading from part /var/lib/clickhouse/store/c48/c487a9b6-3d95-42e8-b168-cd9c45a9758b/all_1_1134741_47/ from mark 0 with max_rows_to_read = 8192): While executing MergeTreeSequentialSource. (CANNOT_READ_FROM_FILE_DESCRIPTOR) (version 23.3.1.2823 (official build))

Both tables are defined as ReplacingMergeTree, and the problematic column is the first column in both tables. We can query the tables, insert data, and delete without any problem.
We have run these commands many times in the past and never encountered this error.
We cannot reproduce it on other large tables, either in the same environment or in other environments.
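For context, here is a minimal sketch of our setup. The table, column names, and schema below are hypothetical (the real tables are larger and named differently); only the engine and the deduplication pass match what we actually run:

-- Hypothetical table mirroring our setup; real names and schema differ.
CREATE TABLE events
(
    id UInt64,            -- first column, corresponding to the <column_name>.bin file in the error
    payload String,
    updated_at DateTime   -- version column: ReplacingMergeTree keeps the row with the highest value
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY id;

-- The periodic deduplication pass that now fails with CANNOT_READ_FROM_FILE_DESCRIPTOR:
OPTIMIZE TABLE events FINAL;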
How to reproduce
ClickHouse Version: 23.3.1.2823
Query:
OPTIMIZE TABLE <table_name> FINAL
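As an illustration of how the outcome can be checked (using the hypothetical events table from the sketch above), one can compare the raw row count against the count with the FINAL modifier:

-- The raw count may include not-yet-merged duplicates; the FINAL count is
-- what the table should converge to after OPTIMIZE ... FINAL succeeds.
SELECT count() FROM events;
SELECT count() FROM events FINAL;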
Expected behavior
Remove duplicated rows (keeping the most recently updated row)
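For example, with the hypothetical events table sketched above, two inserts sharing the same sorting key should collapse to the newer row:

INSERT INTO events VALUES (1, 'old', '2023-01-01 00:00:00');
INSERT INTO events VALUES (1, 'new', '2023-02-01 00:00:00');

OPTIMIZE TABLE events FINAL;

SELECT * FROM events WHERE id = 1;
-- Expected: a single row (1, 'new', '2023-02-01 00:00:00'),
-- since ReplacingMergeTree(updated_at) keeps the row with the greatest updated_at.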