cdc panic when kafka sink rolling restart #9023

Closed
fubinzh opened this issue May 23, 2023 · 3 comments · Fixed by #9026
Labels
affects-6.5 affects-7.1 area/ticdc Issues or PRs related to TiCDC. severity/critical This is a critical bug. type/bug This is a bug.

Comments

@fubinzh

fubinzh commented May 23, 2023

What did you do?

  1. Deploy a TiDB cluster with 3 CDC nodes (32C 64G each).
  2. Create 2 Kafka changefeeds: one for a single big table, the other for 4k small tables. The lag is normal for both changefeeds.
  3. Rolling restart the Kafka sink (3 instances).

What did you expect to see?

CDC should not panic.

What did you see instead?

CDC panicked with a nil pointer dereference inside sarama:

[root@bogon bigCluster]# kubectl  --kubeconfig kubeconfig.yml -n cdc-kafka-big-cluster-tps-1712340-1-428 logs -p tc-ticdc-2
[WARN] TiCDC server data-dir is not set. Please use `cdc server --data-dir` to start the cdc server if possible.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x12b2609]

goroutine 74332 [running]:
github.com/Shopify/sarama.(*partitionProducer).newHighWatermark(0xc0c18b9140, 0x3)
        github.com/Shopify/sarama@v1.36.0/async_producer.go:620 +0x1a9
github.com/Shopify/sarama.(*partitionProducer).dispatch(0xc0c18b9140)
        github.com/Shopify/sarama@v1.36.0/async_producer.go:564 +0x537
github.com/Shopify/sarama.withRecover(0xc1ea43e580?)
        github.com/Shopify/sarama@v1.36.0/utils.go:43 +0x3e
created by github.com/Shopify/sarama.(*asyncProducer).newPartitionProducer
        github.com/Shopify/sarama@v1.36.0/async_producer.go:515 +0x1ea

Versions of the cluster

cdc version: #9010

Release Version: v7.1.0
Git Commit Hash: 9b1497c7fba1d290443011f1d7d1e4305a125e1d
Git Branch: heads/refs/tags/v7.1.0
UTC Build Time: 2023-05-22 10:16:54
Go Version: go version go1.20.3 linux/amd64
Failpoint Build: false
@fubinzh fubinzh added area/ticdc Issues or PRs related to TiCDC. type/bug This is a bug. labels May 23, 2023
@github-actions github-actions bot added this to Need Triage in Question and Bug Reports May 23, 2023
@asddongmen asddongmen added the severity/critical This is a critical bug. label May 23, 2023
@asddongmen asddongmen added severity/major This is a major bug. and removed severity/critical This is a critical bug. labels May 23, 2023
@asddongmen
Contributor

This issue is caused by a sarama bug: IBM/sarama#2322
It was fixed by: IBM/sarama@2379257
We need to bump cdc's sarama dependency to v1.38.1 to avoid this issue.
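
For illustration, a minimal sketch of what such a dependency bump could look like in go.mod. It assumes the module is still required under the github.com/Shopify/sarama import path that appears in the stack trace above (v1.36.0 there); the actual change made in the fixing PR may differ.

```
// go.mod (fragment): only the sarama requirement is shown.
// Assumption: the import path is unchanged from the one in the stack trace.
require github.com/Shopify/sarama v1.38.1
```

After editing the requirement, running `go mod tidy` would normally refresh go.sum to match.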

@nongfushanquan
Contributor

/severity critical

@nongfushanquan
Contributor

/remove-label affects-6.1

@ti-chi-bot ti-chi-bot bot removed the affects-6.1 label Jun 15, 2023