[SUPPORT] kafka connect to Hudi - getting Transaction Participant doesn't exist error while trying to add Hudi-Sink to the connector #8712

@yashps99

Tips before filing an issue

  • Have you gone through our FAQs?

  • Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.

  • If you have triaged this as a bug, then file an issue directly.

Describe the problem you faced

I get the error shown in the stacktrace below when I try to register the Hudi sink connector with Kafka Connect. What could be causing it?

To Reproduce

Steps to reproduce the behavior:

  1. Run Zookeeper and Kafka on a separate cluster.
  2. Create the hudi-control-topic and hudi-test-topic topics.
  3. Go to the Kafka home directory and run: ./bin/connect-distributed.sh $HUDI_DIR/connect-distributed.properties
  4. Submit the connector configuration with a POST request: curl -X POST -H "Content-Type:application/json" -d @$HUDI_DIR/config-sink.json http://localhost:8083/connectors
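After the POST request, the Kafka Connect REST API can report the connector and task state through GET /connectors/<name>/status. The sketch below is only an illustration of how that response could be inspected: the helper names `fetch_status` and `failed_tasks` are hypothetical, and the base URL assumes the worker listens on localhost:8083 as in the steps above. A retriable error like the one in this report may leave the task in the RUNNING state, so the worker log remains the primary source.

```python
import json
import urllib.request


def failed_tasks(status: dict) -> list:
    """Return (task id, state, first line of trace) for every task that is not RUNNING."""
    result = []
    for task in status.get("tasks", []):
        if task.get("state") != "RUNNING":
            trace = task.get("trace", "")
            first_line = trace.splitlines()[0] if trace else ""
            result.append((task.get("id"), task.get("state"), first_line))
    return result


def fetch_status(connector: str, base_url: str = "http://localhost:8083") -> dict:
    """Query the Connect worker for a connector's status (hypothetical helper)."""
    url = f"{base_url}/connectors/{connector}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

With a worker running, `failed_tasks(fetch_status("<connector name>"))` would list any tasks that Kafka Connect has marked as failed or paused, using the name given in the sink configuration.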

Environment Description

  • Hudi version : 0.9.0

  • Spark version : 2.4.8

  • Hive version : 2.3.8

  • Hadoop version : 2.10.1

  • Storage (HDFS/S3/GCS..) : S3

  • Running on Docker? (yes/no) : no

Notes

1. connect-distributed.properties configuration:

bootstrap.servers=xx.xx.xx.xx:9092
group.id=hudi-connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.topic=connect-offsets
offset.storage.replication.factor=1
config.storage.topic=connect-configs
config.storage.replication.factor=1
status.storage.topic=connect-status
status.storage.replication.factor=1

offset.flush.interval.ms=60000
listeners=HTTP://:8083
plugin.path=/usr/local/share/kafka/plugins

2. config-sink-test.json configuration:

{
  "name": "hudi-test-topic",
  "config": {
    "bootstrap.servers": "xx.xx.xx.xx:9092",
    "connector.class": "org.apache.hudi.connect.HoodieSinkConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter.schemas.enable": "false",
    "topics": "hudi-test-topic",
    "hoodie.table.name": "test_hudi_table",
    "hoodie.table.type": "MERGE_ON_READ",
    "hoodie.base.path": "s3a://",
    "hoodie.datasource.write.partitionpath.field": "date",
    "hoodie.datasource.write.recordkey.field": "volume",
    "hoodie.kafka.commit.interval.secs": 60
  }
}
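Before submitting a sink configuration like this, it can help to confirm that the file is valid JSON and has the fields Kafka Connect expects. The following is a minimal sketch, not part of Hudi or Kafka: `check_sink_config` is a hypothetical helper, and the checks cover only the top-level "name" and "config" entries, "connector.class", and a "topics" or "topics.regex" setting.

```python
import json


def check_sink_config(text: str) -> list:
    """Return a list of problems found in a Kafka Connect sink config (empty if none)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        # A missing or unbalanced quote ends up here with its position.
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    if "name" not in doc:
        problems.append('missing top-level "name"')
    config = doc.get("config")
    if not isinstance(config, dict):
        problems.append('missing "config" object')
        return problems
    if "connector.class" not in config:
        problems.append('missing "connector.class" in config')
    if "topics" not in config and "topics.regex" not in config:
        problems.append('missing "topics" or "topics.regex" in config')
    return problems
```

For example, `check_sink_config(open("config-sink-test.json").read())` returns an empty list when the file parses and contains the basic fields. It does not validate any Hudi-specific option.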

Stacktrace

[2023-05-15 14:40:53,244] ERROR WorkerSinkTask{id=hudi-sink-0} RetriableException from SinkTask: (org.apache.kafka.connect.runtime.WorkerSinkTask:601)
org.apache.kafka.connect.errors.RetriableException: TransactionParticipant should be created for each assigned partition, but has not been created for the topic/partition: hudi-test-topic:0
	at org.apache.hudi.connect.HoodieSinkTask.put(HoodieSinkTask.java:111)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:582)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:330)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:237)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Refer to the link:
https://github.com/apache/hudi/tree/master/hudi-kafka-connect
