Affected Version
0.18.1 to 0.22.1
Description
Due to the large volume of data in our production environment, our Kafka cluster has to use single-replica topics. When a Kafka node goes down, new Kafka indexing tasks cannot be started. A Supervisor that is already running keeps running, but after a reset operation it can no longer run either.
If this happens in production and the failed Kafka node cannot be recovered quickly, how can the reliability of the Druid ingestion tasks be improved?
In my test, the error message was: 'Timeout of 60000ms expired before the position for partition topic-0 could be determined'. After a while, the Supervisor's state changed to 'LOST_CONTACT_WITH_STREAM'.

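
For anyone trying to reproduce this, here is a minimal Python sketch of how the Supervisor state can be checked from a script. It assumes the Overlord is reachable at `http://localhost:8090` (a placeholder address) and that the supervisor status endpoint reports the state under `payload.detailedState`. The function names and the sample document are hypothetical and hand-written for illustration, not captured from my cluster.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: placeholder Overlord address; replace with your own.
OVERLORD_URL = "http://localhost:8090"


def fetch_supervisor_status(supervisor_id, base_url=OVERLORD_URL, timeout=10):
    """Fetch the status document for one supervisor from the Overlord API."""
    url = f"{base_url}/druid/indexer/v1/supervisor/{quote(supervisor_id, safe='')}/status"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def lost_contact(status):
    """Return True if the status document reports LOST_CONTACT_WITH_STREAM."""
    # Assumption: the detailed state lives under payload.detailedState.
    payload = status.get("payload") or {}
    return payload.get("detailedState") == "LOST_CONTACT_WITH_STREAM"


# Hand-written sample shaped like a status response (illustrative only).
sample_status = {
    "id": "topic",
    "payload": {
        "stream": "topic",
        "state": "UNHEALTHY_SUPERVISOR",
        "detailedState": "LOST_CONTACT_WITH_STREAM",
    },
}

print(lost_contact(sample_status))
```

Against a live cluster, `lost_contact(fetch_supervisor_status("topic"))` would perform the same check, where `"topic"` stands in for the actual supervisor ID.
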
Thank you!