Because the CI is running into flakiness problems with the distributed tests, I am disabling the call to torchrun in #452. It would be neat to re-enable it once we know what is going on.
Based on an analysis of the build logs, the problem seems to be connected to lambda-server1, but the cause could also be something else that is merely correlated with that machine. (This is based on the 186 runs I got from the Azure API this morning.)
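For anyone who wants to repeat this kind of check, here is a minimal sketch of the tally I have in mind: count failures per build agent and compare the rates. It assumes the runs have already been reduced to `(agent, succeeded)` pairs. The function name, the `other-agent` name, and the records below are made-up placeholders, not the actual 186 runs or their results.

```python
from collections import Counter


def failure_rate_by_agent(runs):
    """Return {agent: (failures, total, rate)} for (agent, succeeded) pairs."""
    totals = Counter()
    failures = Counter()
    for agent, succeeded in runs:
        totals[agent] += 1
        if not succeeded:
            failures[agent] += 1
    return {a: (failures[a], totals[a], failures[a] / totals[a]) for a in totals}


# Hypothetical placeholder records, NOT the real CI data.
example_runs = [
    ("lambda-server1", False),
    ("lambda-server1", True),
    ("other-agent", True),
    ("other-agent", True),
]

for agent, (failed, total, rate) in sorted(failure_rate_by_agent(example_runs).items()):
    print(f"{agent}: {failed}/{total} failed ({rate:.0%})")
```

A higher failure rate on one agent would only show a correlation, so this cannot by itself rule out another cause that happens to coincide with that machine.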
cc @Borda