When a Kafka node goes down in single-replica topic mode, the Kafka indexing task cannot be started

### Affected Version
0.18.1 to 0.22.1

### Description
Because of the large data volume in our production environment, our Kafka cluster has to use single-replica topics (replication factor 1). When a Kafka node goes down, the Kafka indexing task cannot be started. A Supervisor that was already running normally keeps running, but after a reset operation it can no longer run either.
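To illustrate why a single node failure is enough to cause this, here is a minimal sketch in Python. It assumes a hypothetical partition-to-broker layout (the broker IDs and assignments are made up for illustration and are not taken from our cluster), and `offline_partitions` is an illustrative helper, not part of Kafka or Druid. With one replica per partition, every partition hosted on the failed broker has no remaining replica that could serve it.

```python
from typing import Dict, List, Set


def offline_partitions(assignment: Dict[int, List[int]], down_brokers: Set[int]) -> List[int]:
    """Return the partitions whose replicas all live on brokers that are down."""
    return sorted(
        partition
        for partition, replicas in assignment.items()
        if all(broker in down_brokers for broker in replicas)
    )


# Hypothetical layout: partition id -> broker ids holding a replica.
single_replica = {0: [1], 1: [2], 2: [3]}        # replication factor 1
two_replicas = {0: [1, 2], 1: [2, 3], 2: [3, 1]}  # replication factor 2

# Broker 1 goes down.
print(offline_partitions(single_replica, {1}))  # [0]
print(offline_partitions(two_replicas, {1}))    # []
```

In the single-replica layout, partition 0 becomes unavailable as soon as broker 1 is down, which matches the symptom that a partition's position cannot be determined. In the two-replica layout, the same failure leaves every partition with a surviving replica.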

If this happens in the production environment and the Kafka node cannot be recovered in a short time, how can the reliability of the Druid ingestion task be improved?

The following is a screenshot from my test. The error message is: 'Timeout of 60000ms expired before the position for partition topic-0 could be determined'. After a while, the Supervisor's state changed to 'LOST_CONTACT_WITH_STREAM'.
![image](https://user-images.githubusercontent.com/41256589/161210254-6b53cf2e-e43f-4472-9a89-328ce45a1529.png)

Thank you!
