Hi, I'm a bit confused about how to run multi-node inference at the end of training. I'm using DeepSpeed ZeRO-3. Currently, at the end of training, each process writes its outputs to local disk. The local main process then aggregates those results and writes them out on each node.
Is there any way to gather all the results across all the nodes?
Thank you
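One possible approach, sketched here under assumptions rather than as a confirmed answer: DeepSpeed runs on top of a `torch.distributed` process group that spans every node, so `torch.distributed.all_gather_object` can collect picklable per-rank results from all processes instead of only those on the local node. The helper name `gather_results` is hypothetical, and the sketch assumes each rank holds its predictions as a plain Python list.

```python
try:
    import torch.distributed as dist
except ImportError:  # lets the sketch be imported on a machine without PyTorch
    dist = None


def gather_results(local_results):
    """Return the results of every rank, concatenated in rank order.

    `gather_results` is a hypothetical helper name, not a library API.
    """
    if dist is None or not dist.is_available() or not dist.is_initialized():
        # Single-process run: there is nothing to gather.
        return list(local_results)

    world_size = dist.get_world_size()  # counts processes on all nodes
    gathered = [None] * world_size
    # Collective call: every rank must reach this line, or the job hangs.
    dist.all_gather_object(gathered, list(local_results))
    # Flatten the per-rank lists into one list.
    return [item for rank_results in gathered for item in rank_results]
```

After the call, every rank holds the full list, so a single process (for example the one where `dist.get_rank() == 0`) can write one combined file. With the NCCL backend, each process should have selected its GPU with `torch.cuda.set_device` before calling `all_gather_object`, and very large outputs may be better merged through files on shared storage.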
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.