Parallel consumer stops processing data sometimes #606
Comments
We will investigate, but we are planning to release a new version this week that will include metrics. Can you upgrade at that point and see whether you can capture any more information that would help us triage? Thanks.
Looking forward to the new version. Metrics would definitely help us investigate this.
I am experiencing the same issue. Looking forward to the metrics update.
Hi, I am also experiencing the same issue.
I am also experiencing this issue, and after some digging I found the root cause. I am trying to fix it on my end and may create a PR for it later.
Hi, I am experiencing the same issue too.
Closed by #623
Hi,
In production we noticed that the parallel consumer (0.5.2.5) sometimes stops processing data. The problem is similar to #547.
We cannot reproduce the problem, but we noticed the following:
While looking into the fix for #547, I wondered whether the fix is complete.
The fix checks whether a WorkContainer is stale and, if it is, ends the flight of the WorkContainer and updates numberRecordsOutForProcessing:
https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/state/WorkManager.java#L251
Are those WorkContainers also cleaned up from the ProcessingShard which keeps a set of WorkContainers and is used to fetch work?
https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/state/ProcessingShard.java#L44
In the success and failure cases the ShardManager is called, but in the stale case it is not:
https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/state/WorkManager.java#L152
https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/state/WorkManager.java#L173
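To make the question concrete, here is a minimal sketch in Java of the bookkeeping I am asking about. Everything in it is a simplified, hypothetical stand-in: Work, Shard, Manager and leftoverAfterStale are names invented for illustration and are not the library's WorkContainer, ProcessingShard or WorkManager. It only shows that if the stale path updates the in-flight state and the counter without telling the shard, the stale entry stays in the shard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model. None of these classes are the real
// parallel-consumer types; they only mirror the bookkeeping described above.
class StaleWorkSketch {

    /** Stand-in for a WorkContainer: one record, identified by its offset. */
    static final class Work {
        final long offset;
        boolean inFlight;
        Work(long offset) { this.offset = offset; }
    }

    /** Stand-in for a ProcessingShard: keeps entries until told to remove them. */
    static final class Shard {
        private final Map<Long, Work> entries = new TreeMap<>();
        void add(Work w) { entries.put(w.offset, w); }
        void remove(Work w) { entries.remove(w.offset); }
        int size() { return entries.size(); }

        /** Hands out every entry that is not already in flight. */
        List<Work> takeAvailable() {
            List<Work> taken = new ArrayList<>();
            for (Work w : entries.values()) {
                if (!w.inFlight) { w.inFlight = true; taken.add(w); }
            }
            return taken;
        }
    }

    /** Stand-in for the manager-side bookkeeping (assumed, not the real logic). */
    static final class Manager {
        final Shard shard = new Shard();
        int recordsOutForProcessing;

        List<Work> getWork() {
            List<Work> taken = shard.takeAvailable();
            recordsOutForProcessing += taken.size();
            return taken;
        }

        void onSuccess(Work w) {
            w.inFlight = false;
            recordsOutForProcessing--;
            shard.remove(w); // the shard is notified on this path
        }

        void onStale(Work w, boolean notifyShard) {
            w.inFlight = false;
            recordsOutForProcessing--;
            if (notifyShard) shard.remove(w); // the step in question
        }
    }

    /** Returns how many entries remain in the shard after one of three records turns stale. */
    static int leftoverAfterStale(boolean notifyShard) {
        Manager m = new Manager();
        for (long offset = 0; offset < 3; offset++) m.shard.add(new Work(offset));
        List<Work> taken = m.getWork();
        m.onSuccess(taken.get(0));
        m.onStale(taken.get(1), notifyShard);
        m.onSuccess(taken.get(2));
        return m.shard.size();
    }

    public static void main(String[] args) {
        System.out.println("shard not notified: " + leftoverAfterStale(false));
        System.out.println("shard notified:     " + leftoverAfterStale(true));
    }
}
```

In this toy model the counter returns to zero in both cases, but one entry remains in the shard when it is not notified. Whether the real ProcessingShard behaves the same way, and whether that explains the stall, is exactly what I am unsure about.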
Kind regards,
Bart