
Simplify distributed training #864

Closed

futurely opened this issue Sep 19, 2019 · 3 comments

@futurely

🚀 Feature

Simplify distributed training so that users do not have to manually set up the graph store, sampler, and kvstore, which is inefficient for development and error-prone.

Motivation

The only difference between distributed and non-distributed training in PyTorch-BigGraph is adding a command-line argument "--rank rank" and a few more configs.

The training script automatically handles both situations.
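
The pattern is roughly the following sketch, which uses torch.distributed for illustration; it is not PyTorch-BigGraph's actual code, and everything beyond the --rank flag (the --world-size and --master options, the gloo/TCP rendezvous) is an assumption for concreteness:

```python
# Illustrative sketch (not PyTorch-BigGraph's actual code): one training script
# that switches between single-machine and distributed mode based on --rank.
import argparse
import torch.distributed as dist


def train(distributed):
    ...  # model construction and training loop (omitted)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--rank", type=int, default=None,
                        help="Rank of this machine; omit for single-machine training.")
    parser.add_argument("--world-size", type=int, default=1,
                        help="Total number of machines (stand-in for the extra configs).")
    parser.add_argument("--master", default="127.0.0.1:29500",
                        help="Placeholder address of the coordinating machine.")
    args = parser.parse_args()

    if args.rank is not None and args.world_size > 1:
        # Distributed mode: join the process group; everything after this point
        # is the same code path as the single-machine run.
        dist.init_process_group(backend="gloo",
                                init_method=f"tcp://{args.master}",
                                rank=args.rank,
                                world_size=args.world_size)
        train(distributed=True)
    else:
        # Single-machine mode: no extra setup required.
        train(distributed=False)


if __name__ == "__main__":
    main()
```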

Euler requires knowing the hosts in the cluster, but its training script for launching distributed training is also very concise.

Pitch

The framework should transparently set up distributed training so that users can run it with minimal effort and without worrying about the underlying details.
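
For concreteness, one way this could surface to users is sketched below; the helper name auto_initialize and the environment-variable convention are assumptions for illustration, not DGL's design:

```python
# A minimal sketch of the pitched behavior (an illustration, not an existing DGL
# API): user code calls one helper, which decides from launcher-provided
# environment variables whether this is a distributed run. Ideally the framework
# would also start the graph store, sampler, and kvstore inside such a call.
import os
import torch.distributed as dist


def auto_initialize():
    """Hypothetical helper: join a process group only if a launcher configured one."""
    if int(os.environ.get("WORLD_SIZE", "1")) > 1:
        # MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE are the standard variables
        # consumed by torch.distributed's "env://" rendezvous.
        dist.init_process_group(backend="gloo", init_method="env://")
        return True   # distributed run
    return False      # single-machine run


def main():
    distributed = auto_initialize()
    # The training code below would be identical in both modes; the framework
    # hides the cluster setup instead of the user wiring it up by hand.
    print("running distributed:", distributed)


if __name__ == "__main__":
    main()
```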

@zheng-da
Collaborator

I totally agree with you that the training script should set up everything for distributed training. It's also in our plan.

@github-actions

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.

@github-actions

This issue is closed due to lack of activity. Feel free to reopen it if you still have questions.
