-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#10314 Better worker-side error handling in disttrial #11644
#10314 Better worker-side error handling in disttrial #11644
Conversation
please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a very quick review.
I am leaving the command as I don't know when I will have time to do a full review.
If no other developer will have time to review this in the next few days, I think that this should be merge.
I don't think is ok to block a merge for a PR that has tests for the new code, is well documented and the tests are green.
8d0411e
to
5cc7599
Compare
This avoids a protocol error when the string representations exceed the length limits imposed by the AMP protocol disttrial uses.
5cc7599
to
5811136
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good. Only minor comments.
It's still on a class in a private package so this doesn't make it public to users outside of Twisted (or even outside of disttrial). It does make it public to `WorkerProtocol` which needs to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few other minor comments.
I still have to review - _runErrorTest
and WorkerProtocol.run
I will try to continue the review later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Only a medium comment regarding the usage of context manager.
But I think it can be fixed and merged without another round of reviews.
All reporting results must have happened by the time `suite.run` returns so there's no reason to process them inside the context manager. And processing them in the context manager means that if any code somehow starts a _new_ reporting operation the list we passed to DeferredList will be mutated (and presumably then ignore). Ending the context manager turns that case into a hard error, instead.
Scope and purpose
Fixes #10314
This adds error handling (any, at all!) to the "workerreporter" that manages communication of test results from the disttrial worker processes back to the parent process so that such errors won't cause the whole trial process to exit without a result report.
It also changes the network protocol between workers and parent to stream error text so that text longer than the AMP limit of 64k makes it back to the parent without error.