-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(multiprocessing): Reset pool if tasks are not completed #315
Conversation
If multiprocessing tasks are not completed within the timeout specified, we need to reset the pool to avoid state being carried over between assignments. This change also defers the initial creation of the multiprocessing pool until the first message is received, which moves it out of the assignment callback (which needs to be fast).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this change also needs a test. i am also curious if we can reproduce locally whether running into the join timeout "poisons" the pool.
also please add a counter metric when the pool is being recreated, i imagine that with this change, we will have to fine-tune join timeout again to improve consumer performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approving since we can't figure out a repro case locally right now, and the test I proposed already exists.
If multiprocessing tasks are not completed within the timeout specified, we need to reset the pool to avoid state being carried over between assignments.