Fatal error when creating a table with a Kafka table engine #40238

Open
a-dot opened this issue Aug 15, 2022 · 3 comments

@a-dot

a-dot commented Aug 15, 2022

Describe what's wrong

When creating a table using the Kafka table engine, I get the failed assertion and stack trace below, and ClickHouse dies.

*** ../contrib/librdkafka/src/rdkafka_request.c:219:rd_kafka_buf_read_topic_partitions: assert: rkbuf->rkbuf_rkb ***
[ 259 ] {} <Fatal> BaseDaemon: #####################################
[ 259 ] {} <Fatal> BaseDaemon: (version 22.7.35 (official build), build id: A97EE8D81E41A58E) (from thread 243) (no query) Received signal Aborted (6)
[ 259 ] {} <Fatal> BaseDaemon: 
[ 259 ] {} <Fatal> BaseDaemon: Stack trace: 0x7f09530df00b 0x7f09530be859 0x1a609a3d 0x1a6c7fdf 0x1a63ee67 0x1a63c7f4 0x1a606814 0x1a6adce1 0x1a6c1895 0x1a60c838 0x1a71d6b9 0x7f0953296609 0x7f09531bb133
[ 259 ] {} <Fatal> BaseDaemon: 2. raise @ 0x7f09530df00b in ?
[ 259 ] {} <Fatal> BaseDaemon: 3. abort @ 0x7f09530be859 in ?
[ 259 ] {} <Fatal> BaseDaemon: 4. ? @ 0x1a609a3d in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 5. rd_kafka_buf_read_topic_partitions @ 0x1a6c7fdf in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 6. ? @ 0x1a63ee67 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 7. ? @ 0x1a63c7f4 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 8. rd_kafka_buf_callback @ 0x1a606814 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 9. rd_kafka_op_handle @ 0x1a6adce1 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 10. rd_kafka_q_serve @ 0x1a6c1895 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 11. ? @ 0x1a60c838 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 12. ? @ 0x1a71d6b9 in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 13. ? @ 0x7f0953296609 in ?
[ 259 ] {} <Fatal> BaseDaemon: 14. __clone @ 0x7f09531bb133 in ?
[ 259 ] {} <Fatal> BaseDaemon: Integrity check of the executable successfully passed (checksum: 35C037103309C93E3EF6D59CA6A2CF05)

Does it reproduce on recent release?

Yes

How to reproduce

  • Which ClickHouse server version to use: any
  • CREATE TABLE statements for all tables involved: any kafka table with default values

Expected behavior

Not a fatal error.

Error message and/or stacktrace

See above.

Additional context

This seems to be a known bug in librdkafka versions >1.4.2 and <1.9.0, and I believe you use 1.5.0?

See these links:
confluentinc/librdkafka#3223
confluentinc/confluent-kafka-go#650 (comment)
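
To make the version range above concrete, here is a minimal Python sketch that checks whether a given librdkafka version falls inside it. The bounds are copied from this report and are not independently verified; `parse_version` and `in_reported_range` are hypothetical helper names used only for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '1.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Bounds as stated in this report (exclusive on both ends); an assumption,
# not a confirmed statement from the librdkafka maintainers.
LOWER_EXCLUSIVE = parse_version("1.4.2")
UPPER_EXCLUSIVE = parse_version("1.9.0")


def in_reported_range(version):
    """Return True if `version` lies strictly between the reported bounds."""
    return LOWER_EXCLUSIVE < parse_version(version) < UPPER_EXCLUSIVE


if __name__ == "__main__":
    for candidate in ("1.4.2", "1.5.0", "1.9.0"):
        print(candidate, in_reported_range(candidate))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexicographically (for example, "1.10.0" sorting before "1.9.0").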

@a-dot a-dot added the potential bug To be reviewed by developers and confirmed/rejected. label Aug 15, 2022
@filimonov filimonov added the comp-kafka Kafka Engine label Sep 9, 2022
@filimonov
Contributor

> CREATE TABLE statements for all tables involved: any kafka table with default values

It is not reproducible with ANY kafka table :)

Most probably it's related to some specific broker behaviour ("malformed JoinGroupResponse consumer group metadata"). I did not dig deeper, but the links you added seem relevant, and the fix should be here:
https://github.com/edenhill/librdkafka/pull/3678/files

I'll try to either upgrade librdkafka (there were some issues with OpenSSL there) or just backport the fix.

@a-dot
Author

a-dot commented Sep 9, 2022

Thank you for following up. I was able to narrow this crash down to one specific broker, as you said. However, I don't have access to the logs on the broker to provide more details.

While doing more troubleshooting on this particular broker, I noticed that simply changing the consumer group name resolves the issue and lets the Kafka table start up and consume normally. I think that lines up with what you suspect to be the issue.
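
For anyone who wants to try the same workaround, a rough Python sketch of building the `CREATE TABLE` statement with a fresh consumer group name follows. The table name, column, broker address, topic, and group names are all hypothetical placeholders; only the `kafka_*` setting names are taken from the ClickHouse Kafka engine. This is a workaround sketch under those assumptions, not a fix for the underlying assert.

```python
def kafka_table_ddl(table, broker, topic, group, fmt="JSONEachRow"):
    """Build a CREATE TABLE statement for a Kafka engine table.

    All argument values are supplied by the caller; nothing here is
    specific to the setup from this issue.
    """
    return (
        f"CREATE TABLE {table} (message String) "
        f"ENGINE = Kafka "
        f"SETTINGS kafka_broker_list = '{broker}', "
        f"kafka_topic_list = '{topic}', "
        f"kafka_group_name = '{group}', "
        f"kafka_format = '{fmt}'"
    )


if __name__ == "__main__":
    # Hypothetical names: the old group was 'clickhouse_events', so a new
    # suffix is appended to make the broker treat it as a fresh group.
    print(kafka_table_ddl("events_queue", "broker:9092", "events",
                          "clickhouse_events_v2"))
```

Note that a new group name means the consumer starts without the committed offsets of the old group, so where consumption begins depends on the offset reset configuration.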

@donge
Contributor

donge commented Apr 16, 2024

This issue can be reproduced in 24.3.2.23.

use-server/server.key: error:02000002:system library:OPENSSL_internal:No such file or directory (version 24.3.2.23 (official build))
*** contrib/librdkafka/src/rdkafka_request.c:219:rd_kafka_buf_read_topic_partitions: assert: rkbuf->rkbuf_rkb ***
2024.04.16 15:58:43.608870 [ 754 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2024.04.16 15:58:43.608905 [ 754 ] {} <Fatal> BaseDaemon: (version 24.3.2.23 (official build), build id: 5B5C43F049E2D5125E894547577CDA8EEC25B3C7, git hash: 8b7d910960cc2c6a0db07991fe2576a67fe98146) (from thread 747) Received signal 6
2024.04.16 15:58:43.608916 [ 754 ] {} <Fatal> BaseDaemon: Signal description: Aborted
2024.04.16 15:58:43.608920 [ 754 ] {} <Fatal> BaseDaemon:
2024.04.16 15:58:43.608935 [ 754 ] {} <Fatal> BaseDaemon: Stack trace: 0x0000ffffb5471d78 0x0000ffffb545eaac 0x0000aaaae11b0f50 0x0000aaaae1233f60 0x0000aaaae11d11c8 0x0000aaaae11cefac 0x0000aaaae11ae310 0x0000aaaae1221b84 0x0000aaaae122f9b4 0x0000aaaae11b3938 0x0000aaaae126d720 0x0000ffffb55b8624 0x0000ffffb550f62c
2024.04.16 15:58:43.608946 [ 754 ] {} <Fatal> BaseDaemon: ########################################
2024.04.16 15:58:43.608953 [ 754 ] {} <Fatal> BaseDaemon: (version 24.3.2.23 (official build), build id: 5B5C43F049E2D5125E894547577CDA8EEC25B3C7, git hash: 8b7d910960cc2c6a0db07991fe2576a67fe98146) (from thread 747) (no query) Received signal Aborted (6)
2024.04.16 15:58:43.608956 [ 754 ] {} <Fatal> BaseDaemon:
2024.04.16 15:58:43.608960 [ 754 ] {} <Fatal> BaseDaemon: Stack trace: 0x0000ffffb5471d78 0x0000ffffb545eaac 0x0000aaaae11b0f50 0x0000aaaae1233f60 0x0000aaaae11d11c8 0x0000aaaae11cefac 0x0000aaaae11ae310 0x0000aaaae1221b84 0x0000aaaae122f9b4 0x0000aaaae11b3938 0x0000aaaae126d720 0x0000ffffb55b8624 0x0000ffffb550f62c
2024.04.16 15:58:43.609011 [ 754 ] {} <Fatal> BaseDaemon: 2. ? @ 0x0000000000033d78
2024.04.16 15:58:43.609021 [ 754 ] {} <Fatal> BaseDaemon: 3. ? @ 0x0000000000020aac
2024.04.16 15:58:43.609042 [ 754 ] {} <Fatal> BaseDaemon: 4. ? @ 0x0000000012c70f50
2024.04.16 15:58:43.609065 [ 754 ] {} <Fatal> BaseDaemon: 5. rd_kafka_buf_read_topic_partitions @ 0x0000000012cf3f60
2024.04.16 15:58:43.609097 [ 754 ] {} <Fatal> BaseDaemon: 6. rd_kafka_group_MemberMetadata_consumer_read @ 0x0000000012c911c8
2024.04.16 15:58:43.609107 [ 754 ] {} <Fatal> BaseDaemon: 7. rd_kafka_cgrp_handle_JoinGroup @ 0x0000000012c8efac
2024.04.16 15:58:43.609116 [ 754 ] {} <Fatal> BaseDaemon: 8. rd_kafka_buf_callback @ 0x0000000012c6e310
2024.04.16 15:58:43.609138 [ 754 ] {} <Fatal> BaseDaemon: 9. rd_kafka_op_handle @ 0x0000000012ce1b84
2024.04.16 15:58:43.609162 [ 754 ] {} <Fatal> BaseDaemon: 10. rd_kafka_q_serve @ 0x0000000012cef9b4
2024.04.16 15:58:43.609174 [ 754 ] {} <Fatal> BaseDaemon: 11. rd_kafka_thread_main @ 0x0000000012c73938
2024.04.16 15:58:43.609199 [ 754 ] {} <Fatal> BaseDaemon: 12. _thrd_wrapper_function.llvm.4665451034283701737 @ 0x0000000012d2d720
2024.04.16 15:58:43.609227 [ 754 ] {} <Fatal> BaseDaemon: 13. start_thread @ 0x0000000000007624
2024.04.16 15:58:43.609233 [ 754 ] {} <Fatal> BaseDaemon: 14. ? @ 0x00000000000d162c
2024.04.16 15:58:43.609237 [ 754 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read.
2024.04.16 15:58:43.609248 [ 754 ] {} <Fatal> BaseDaemon: Report this error to https://github.com/ClickHouse/ClickHouse/issues
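
Both traces in this thread go through the same librdkafka frames. A rough way to pull the resolved symbol names out of such `BaseDaemon` lines for a side-by-side comparison is sketched below in Python. The regular expression assumes the `N. symbol @ 0x...` frame layout seen in the logs above; `symbols` is a hypothetical helper name, and the sample lines are copied from the two traces.

```python
import re

# Matches frame lines such as:
#   BaseDaemon: 5. rd_kafka_buf_read_topic_partitions @ 0x1a6c7fdf ...
FRAME = re.compile(r"BaseDaemon: (\d+)\. (\S+) @ (0x[0-9a-fA-F]+)")


def symbols(log_text):
    """Return resolved symbol names in frame order, skipping '?' frames."""
    return [m.group(2) for m in FRAME.finditer(log_text) if m.group(2) != "?"]


SAMPLE = """\
[ 259 ] {} <Fatal> BaseDaemon: 4. ? @ 0x1a609a3d in /usr/bin/clickhouse
[ 259 ] {} <Fatal> BaseDaemon: 5. rd_kafka_buf_read_topic_partitions @ 0x1a6c7fdf in /usr/bin/clickhouse
2024.04.16 15:58:43.609097 [ 754 ] {} <Fatal> BaseDaemon: 6. rd_kafka_group_MemberMetadata_consumer_read @ 0x0000000012c911c8
"""

if __name__ == "__main__":
    for name in symbols(SAMPLE):
        print(name)
```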
