Add retry logic to Python boot script. #25473
This will allow runners to act more quickly on failures rather than waiting for all workers to die before exiting the container.
Codecov Report
@@ Coverage Diff @@
## master #25473 +/- ##
==========================================
- Coverage 72.96% 72.79% -0.17%
==========================================
Files 745 749 +4
Lines 99174 99543 +369
==========================================
+ Hits 72362 72465 +103
- Misses 25446 25712 +266
Partials 1366 1366
antonbobkov left a comment
thanks!
sdks/python/container/boot.go
Outdated
// DoFns throwing exceptions.
errorCount += 1
if errorCount < 4 {
	log.Printf("Python (worker %v) exited: %v", workerId, err)
Perhaps add more information to the logs about retries and failures, something along the lines of "Python (worker %v) exited: %v, retrying process." and "Python (worker %v) exited: %v, out of retries: restarting SDK container."
Thanks. Done.
PTAL