[tune] TensorFlow Distributed Trainable #11876

Merged (7 commits) on Nov 10, 2020

@oliverhu oliverhu commented Nov 8, 2020

Why are these changes needed?

Following up on the discussion at https://github.com/ray-project/ray/discussions/11111. Tune currently supports distributed hyperparameter search combined with distributed training for PyTorch and Horovod. We should also support TensorFlow's built-in distributed training strategies, including MultiWorkerMirroredStrategy and ParameterServerStrategy.
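
For context, TensorFlow's multi-worker strategies discover their peers through the `TF_CONFIG` environment variable, so any Tune integration has to produce one such config per worker. Below is a minimal sketch of that step only, not the implementation in this PR. The helper name `build_tf_config` and the host addresses are hypothetical.

```python
import json


def build_tf_config(worker_addresses, task_index):
    """Return the TF_CONFIG JSON string for one worker in a cluster.

    `worker_addresses` is a list of "host:port" strings, identical on every
    worker; `task_index` is this worker's position in that list.
    """
    if not 0 <= task_index < len(worker_addresses):
        raise ValueError("task_index must index into worker_addresses")
    return json.dumps({
        "cluster": {"worker": list(worker_addresses)},
        "task": {"type": "worker", "index": task_index},
    })


# Hypothetical two-worker cluster; the addresses are placeholders.
workers = ["10.0.0.1:12345", "10.0.0.2:12345"]
configs = [build_tf_config(workers, i) for i in range(len(workers))]
```

Each worker process would set `os.environ["TF_CONFIG"]` to its own entry before constructing the strategy. Every worker must see the same `cluster` section and differ only in `task.index`.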

Related issue number

https://github.com/ray-project/ray/discussions/11111

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@oliverhu oliverhu changed the title from "[tune] TensorFlow Keras Distributed Trainable [DRAFT]" to "[tune] TensorFlow Distributed Trainable [DRAFT]" Nov 8, 2020
@richardliaw richardliaw self-assigned this Nov 8, 2020
@oliverhu oliverhu changed the title from "[tune] TensorFlow Distributed Trainable [DRAFT]" to "[tune] TensorFlow Distributed Trainable" Nov 9, 2020
@richardliaw
Contributor

Thanks a bunch for this PR!

Can you be sure to add the following:

@oliverhu
Member Author

@richardliaw thanks for the detailed comments. Updated.

oliverhu and others added 2 commits November 9, 2020 20:24
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
@richardliaw richardliaw merged commit 0c1bdae into ray-project:master Nov 10, 2020
2 participants