Clickhouse crash/data inconsistency if column and nested table have same name #36146

Closed
dmitriy-myz opened this issue Apr 11, 2022 · 6 comments
Labels
bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort

Comments

@dmitriy-myz
Contributor


Describe what's wrong
If you create a table that contains a column and a nested table with the same name, the server crashes or returns inconsistent data.

Does it reproduce on recent release?
Yes

How to reproduce

  • Which ClickHouse server version to use
    starting from 21.9. Reproduces on 22.3.3.44
  • CREATE TABLE statements for all tables involved
CREATE TABLE tst
(   
    `id` Int,
    `col` Array(String),
    `col.s` Array(LowCardinality(String)),
    `col.u` Array(LowCardinality(String))
)
ENGINE = ReplacingMergeTree(id)
PARTITION by tuple()
ORDER BY id
  • Queries to run that lead to unexpected result
-- populate data in compact format
insert into tst (id, `col`, `col.s`, `col.u`) select number, ['a','b', 'c', 'd'],[],[] from system.numbers limit 1
select * from tst limit 20
-- so far everything is OK
-- populate data in wide part format (part size > 10 MB)
insert into tst (id, `col`, `col.s`, `col.u`) select number, ['a','b', 'c', 'd'],[],[] from system.numbers limit 1000000
select * from tst limit 20
-- the server crashes here
-- if it does not, run:
OPTIMIZE TABLE tst

Expected behavior
The server should not crash, and the data should stay consistent.

Error message and/or stacktrace

SELECT *
FROM tst
LIMIT 20

Query id: 5098c916-d295-4ac1-b275-c8088c2ba8bc

┌─id─┬─col───────────────┬─col.s─┬─col.u─┐
│  0 │ ['a','b','c','d'] │ []    │ []    │
└────┴───────────────────┴───────┴───────┘
[74acc568fb82] 2022.04.11 17:31:00.918886 [ 255 ] <Fatal> BaseDaemon: ########################################
[74acc568fb82] 2022.04.11 17:31:00.918989 [ 255 ] <Fatal> BaseDaemon: (version 22.3.3.44 (official build), build id: F9D3C2B8666BEF5D) (from thread 251) (query_id: 5098c916-d295-4ac1-b275-c8088c2ba8bc) (query: select * from tst limit 20) Received signal Segmentation fault (11)
[74acc568fb82] 2022.04.11 17:31:00.919052 [ 255 ] <Fatal> BaseDaemon: Address: 0x11 Access: read. Address not mapped to object.
[74acc568fb82] 2022.04.11 17:31:00.919103 [ 255 ] <Fatal> BaseDaemon: Stack trace: 0x14c81fc2 0x7f6d89d163c0
[74acc568fb82] 2022.04.11 17:31:00.919165 [ 255 ] <Fatal> BaseDaemon: 0. ? @ 0x14c81fc2 in /usr/bin/clickhouse
[74acc568fb82] 2022.04.11 17:31:00.919234 [ 255 ] <Fatal> BaseDaemon: 1. ? @ 0x7f6d89d163c0 in ?
[74acc568fb82] 2022.04.11 17:31:01.067559 [ 255 ] <Fatal> BaseDaemon: Calculated checksum of the binary: FD20C2FC24F8B8996C15BF97FA841B03. There is no information about the reference checksum.
Exception on client:
Code: 32. DB::Exception: Attempt to read after eof: while receiving packet from localhost:9000. (ATTEMPT_TO_READ_AFTER_EOF)

Connecting to localhost:9000 as user default.

Additional context

This happened after upgrading from an older version where these queries worked fine. The behavior reproduces starting from 21.9 (as far as I could debug it).

Another issue with the same root cause (data corruption / reading outside of a buffer?):

CREATE TABLE tst
(
    `id` Int,
    `col` Array(String),
    `col.s` Array(LowCardinality(String)),
    `col.u` Array(LowCardinality(String))
)
ENGINE = ReplacingMergeTree(id)
PARTITION BY tuple()
ORDER BY id;

insert into tst (id, `col`, `col.s`, `col.u`) select number, ['a','b', 'c', 'd'],[number],[number] from system.numbers limit 1

insert into tst (id, `col`, `col.s`, `col.u`) select number, ['a','b', 'c', 'd'],[number],[number] from system.numbers limit 1000000
select * from tst limit 3
SELECT *
FROM tst
LIMIT 3

Query id: 436ab97d-cdff-445a-9f91-2a76d09082b4

┌─id─┬─col───────────────┬─col.s─┬─col.u─┐
│  0 │ ['a','b','c','d'] │ ['0'] │ ['0'] │
└────┴───────────────────┴───────┴───────┘
┌─id─┬─col───────────────────────────────────────────────────────────────┬─col.s─────────────┬─col.u─────────────┐
│  0 │ ['(\'0\',\'0\')','(\'1\',\'1\')','(\'2\',\'2\')','(\'3\',\'3\')'] │ ['0','1','2','3'] │ ['0','1','2','3'] │
│  1 │ ['(\'4\',\'4\')','(\'5\',\'5\')','(\'6\',\'6\')','(\'7\',\'7\')'] │ ['4','5','6','7'] │ ['4','5','6','7'] │
└────┴───────────────────────────────────────────────────────────────────┴───────────────────┴───────────────────┘

3 rows in set. Elapsed: 0.006 sec. 

optimize table tst

OPTIMIZE TABLE tst

Query id: 1ec099aa-9832-4836-b1b2-81cf15cd5702


0 rows in set. Elapsed: 0.865 sec. 

Received exception from server (version 22.3.3):
Code: 33. DB::Exception: Received from localhost:9000. DB::Exception: Cannot read all array values: read just 16960 of 32768: (while reading column col.s): (while reading from part /var/lib/clickhouse/store/0de/0de68989-9ddc-4c33-a925-65ca8f278f86/all_2_2_0/ from mark 15 with max_rows_to_read = 8192): While executing MergeTreeSequentialSource. (CANNOT_READ_ALL_DATA)
@dmitriy-myz dmitriy-myz added the potential bug To be reviewed by developers and confirmed/rejected. label Apr 11, 2022
@den-crane den-crane added bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort major and removed potential bug To be reviewed by developers and confirmed/rejected. labels Apr 11, 2022
@CurtizJ CurtizJ self-assigned this Apr 12, 2022
@alexey-milovidov alexey-milovidov changed the title from "Clickhouse crash/data inconsitence if column and nested table have same name" to "Clickhouse crash/data inconsistence if column and nested table have same name" Apr 12, 2022
@CurtizJ
Member

CurtizJ commented Apr 14, 2022

Has it ever really worked?

The table definition is wrong. For the Nested columns col.s and col.u, each part should contain one shared offsets file, col.size0. But the standalone Array column col also writes its offsets to a file with the same name, col.size0, so one set of offsets overwrites the other and the stored offsets become inconsistent.

So, this definition:

CREATE TABLE tst
(
    `id` Int,
    `col` Array(String),
    `col.s` Array(LowCardinality(String)),
    `col.u` Array(LowCardinality(String))
)
ENGINE = ReplacingMergeTree(id)
PARTITION BY tuple()
ORDER BY id;

is the same as the following, but in flattened form:

CREATE TABLE tst
(
    `id` Int,
    `col` Array(String),
    `col` Nested(s LowCardinality(String), u LowCardinality(String))
)
ENGINE = ReplacingMergeTree(id)
PARTITION BY tuple()
ORDER BY id;

The last one is obviously wrong.
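
The collision described above can be illustrated with a small model. This is only a sketch in Python, not ClickHouse code: the function `offsets_stream` is hypothetical, and it assumes (following the explanation in this comment) that every flattened Nested element shares an offsets stream named after the part of the column name before the first dot, while a plain Array column uses its own name.

```python
def offsets_stream(column_name: str) -> str:
    """Name of the offsets stream an Array column writes to (simplified model)."""
    # Assumption: a flattened Nested element `table.element` shares one offsets
    # stream named after `table`; a plain Array column uses its own name.
    table_name = column_name.split(".", 1)[0]
    return f"{table_name}.size0"


# Array columns from the table definition in this issue.
columns = ["col", "col.s", "col.u"]
streams = {name: offsets_stream(name) for name in columns}

for name, stream in streams.items():
    print(f"{name:6} -> {stream}")

# A collision exists when distinct columns map to fewer distinct stream names.
collision = len(set(streams.values())) < len(streams)
print("collision:", collision)
```

In this model the standalone `col` and the Nested elements `col.s` and `col.u` all resolve to `col.size0`, even though `col` holds four elements per row while the Nested elements hold a different number, which is why a single offsets file cannot serve both.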

@CurtizJ
Member

CurtizJ commented Apr 14, 2022

Yes, it has never worked. I tried 21.3, 20.3 and 19.3, and on every one of these versions your example either crashes the server or reads wrong data.

We just need to check column names on table creation and forbid such ambiguous names.
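
A check of this kind could look roughly like the following. This is a hypothetical Python sketch, not the validation ClickHouse performs: the helpers `dotted_prefixes` and `find_ambiguous_columns` are invented for illustration, and the check is deliberately conservative because it looks only at names and ignores column types.

```python
from typing import Iterable, List, Tuple


def dotted_prefixes(name: str) -> List[str]:
    """Return every proper dotted prefix of a column name, shortest first."""
    parts = name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def find_ambiguous_columns(column_names: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (prefix, column) pairs where a declared column name is also
    a dotted prefix of another declared column name."""
    names = list(column_names)
    declared = set(names)
    conflicts = []
    for name in names:
        for prefix in dotted_prefixes(name):
            if prefix in declared:
                conflicts.append((prefix, name))
    return conflicts


# The table from this issue: `col` clashes with both Nested elements.
print(find_ambiguous_columns(["id", "col", "col.s", "col.u"]))

# A table with only Nested elements and no standalone `n` column is accepted.
print(find_ambiguous_columns(["id", "n.a", "n.b"]))
```

A table definition would be rejected whenever the returned list is non-empty; renaming either the standalone column or the Nested table clears the conflict.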

@dmitriy-myz
Contributor Author

dmitriy-myz commented Apr 14, 2022

In our case we had two levels of subcolumn nesting.
This worked on 21.3 (fresh server with no existing data):

docker run --rm -it --name clickhouse docker.io/clickhouse/clickhouse-server:21.3

docker exec -it clickhouse clickhouse-client
CREATE TABLE tst
(
    `id` Int,
    `col` Array(String),
    `col.s.a` Array(LowCardinality(String)),
    `col.s.b` Array(LowCardinality(String))
)
ENGINE = ReplacingMergeTree(id)
PARTITION BY tuple()
ORDER BY id;

INSERT INTO tst (id, col, `col.s.a`, `col.s.b`) SELECT
    number,
    ['a', 'b', 'c', 'd'],
    [number],
    [number]
FROM system.numbers
LIMIT 1;


INSERT INTO tst (id, col, `col.s.a`, `col.s.b`) SELECT
    number,
    ['a', 'b', 'c', 'd'],
    [number],
    [number]
FROM system.numbers
LIMIT 1000000;

SELECT *
FROM tst
LIMIT 3;

result:

┌─id─┬─col───────────────┬─col.s.a─┬─col.s.b─┐
│  0 │ ['a','b','c','d'] │ ['0']   │ ['0']   │
└────┴───────────────────┴─────────┴─────────┘
┌─id─┬─col───────────────┬─col.s.a─┬─col.s.b─┐
│  0 │ ['a','b','c','d'] │ ['0']   │ ['0']   │
│  1 │ ['a','b','c','d'] │ ['1']   │ ['1']   │
└────┴───────────────────┴─────────┴─────────┘

But in recent versions this leads to a crash.
BTW, I don't mind if you just forbid creating such tables (we will recreate ours with a correct name).

@den-crane den-crane removed the major label Apr 14, 2022
@alexey-milovidov
Member

Does not crash in the new version: https://fiddle.clickhouse.com/35719b01-c5aa-45e3-a830-34b57fe4661f

@CurtizJ
Member

CurtizJ commented Jan 5, 2023

The issue is still relevant.

@CurtizJ CurtizJ reopened this Jan 5, 2023
@alexey-milovidov alexey-milovidov removed their assignment Feb 26, 2023
@alexey-milovidov alexey-milovidov changed the title from "Clickhouse crash/data inconsistence if column and nested table have same name" to "Clickhouse crash/data inconsistency if column and nested table have same name" Mar 15, 2023
@CurtizJ
Member

CurtizJ commented Sep 28, 2023

This was fixed as a side effect of #50612. It is no longer allowed to create tables whose columns may have colliding column stream names.

@CurtizJ CurtizJ closed this as completed Sep 28, 2023
@CurtizJ CurtizJ self-assigned this Sep 28, 2023