"Could not start gRPC server" flakiness in XLA tests
#77808
Comments
cc @JackCaoG, can you have someone take a look?
Hmm, I had pytorch/xla@935b602, which was supposed to make this situation slightly better, but it got reverted due to a CI issue. Let me try to reland this change.
Another example from today on master: https://github.com/pytorch/pytorch/runs/6615801243?check_suite_focus=true
Hmm, I merged the fix pytorch/xla#3605 and the PyTorch PR #78327, but I see that a different test now triggers this failure. I will do some further cleanup.
pytorch/xla#3615 should help a bit more. Let me know if you still see this kind of error in PyTorch CI.
Another recent failure on master: https://github.com/pytorch/pytorch/runs/6659602672?check_suite_focus=true
another recent failure on master; this one is not |
I don't think those failures are recent. If they rebase their PyTorch branch, the mp test should not be run on CI.
This failure was on the master branch :) Maybe we haven't updated the pin in a while?
I think so, I did not see the call of
For some examples, see here.
Can we add some retries or something to this test?
cc @bdhirsh
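For reference, the retry idea could look roughly like the sketch below. This is only an illustration of the general pattern, not how the XLA tests are actually set up: `retry` and `flaky_start` are hypothetical names, and treating the failure as a `RuntimeError` carrying the "Could not start gRPC server" message is an assumption made for the demo.

```python
import time


def retry(fn, attempts=3, delay_s=1.0, retry_on=(RuntimeError,)):
    """Call fn(); if it raises one of `retry_on`, wait and try again.

    Hypothetical helper: gives up and re-raises the last error after
    `attempts` failed calls. The wait grows linearly with the attempt number.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s * attempt)
    raise last_error


# Stand-in for a flaky setup step (assumption: it fails twice, then succeeds).
calls = {"count": 0}


def flaky_start():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Could not start gRPC server")
    return "started"


result = retry(flaky_start, attempts=5, delay_s=0.0)
print(result, calls["count"])  # prints: started 3
```

Retrying only hides the flakiness, so it would make sense to keep `attempts` small and to still track how often the first attempt fails.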