[feat] Added support for Tensorflow2 strategy distributed training and Horovod AllToAll synchronous distributed training. #347
In the pull request workflow, please avoid force-pushing. After a force-push the reviewer has to review the whole branch all over again, there is no diff between versions of the branch, and unrelated histories can end up in the pull branch. When I worked on DeepRec, they suggested that I reopen a clean PR instead of force-pushing.
demo/dynamic_embedding/movielens-1m-keras-with-horovod/movielens-1m-keras-with-horovod.py
tensorflow_recommenders_addons/dynamic_embedding/python/ops/math_ops.py
tensorflow_recommenders_addons/dynamic_embedding/python/kernel_tests/horovod_sync_train_test.py
tensorflow_recommenders_addons/dynamic_embedding/python/ops/dynamic_embedding_ops.py
tensorflow_recommenders_addons/dynamic_embedding/python/ops/dynamic_embedding_ops.py
tensorflow_recommenders_addons/dynamic_embedding/python/ops/dynamic_embedding_optimizer.py
tensorflow_recommenders_addons/dynamic_embedding/python/ops/dynamic_embedding_optimizer.py
…d Horovod AllToAll synchronous distributed training. [fix] Fix some git auto-merge mistakes and make the code compatible with the latest Keras optimizer.
…rse.reshape has been able to replace it.
…rtions of the command lambda function in Redis backend.
LGTM
Now we can simply run distributed training with TFRA by using a TF strategy or Horovod through the Keras API.
See the demos 'demo/dynamic_embedding/movielens-1m-keras-ps' and 'demo/dynamic_embedding/movielens-1m-keras-with-horovod'.
This also solves the lack of disk space in GitHub CI.
It also fixes some bugs.
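For readers unfamiliar with the AllToAll pattern named in the title, here is a rough single-process sketch of how a sharded embedding lookup can be routed with two all-to-all exchanges. It is not the code from this PR and does not call Horovod or TFRA. The `key % world_size` ownership rule and the helper names `all_to_all` and `distributed_lookup` are assumptions made only for illustration.

```python
# Single-process simulation of an all-to-all embedding lookup.
# Assumption: key k is owned by worker (k % world_size). This sharding rule
# and both function names are hypothetical, not taken from the PR.


def all_to_all(send_buffers):
    """Simulate an all-to-all: send_buffers[src][dst] becomes recv[dst][src]."""
    world_size = len(send_buffers)
    return [[send_buffers[src][dst] for src in range(world_size)]
            for dst in range(world_size)]


def distributed_lookup(local_ids, tables):
    """local_ids[w] are the ids worker w needs; tables[w] is worker w's shard."""
    world_size = len(tables)

    # 1. Every worker buckets the ids it needs by owning worker.
    send_ids = [[[i for i in ids if i % world_size == dst]
                 for dst in range(world_size)]
                for ids in local_ids]

    # 2. First exchange: ids travel to the workers that own them.
    recv_ids = all_to_all(send_ids)

    # 3. Every owner looks up the requested ids in its local shard.
    send_vals = [[[tables[owner][i] for i in bucket]
                  for bucket in recv_ids[owner]]
                 for owner in range(world_size)]

    # 4. Second exchange: embedding values travel back to the requesters.
    recv_vals = all_to_all(send_vals)

    # 5. Every requester restores the original order of its ids.
    results = []
    for w, ids in enumerate(local_ids):
        found = {}
        for owner in range(world_size):
            for i, v in zip(send_ids[w][owner], recv_vals[w][owner]):
                found[i] = v
        results.append([found[i] for i in ids])
    return results


# Two simulated workers: worker 0 owns even keys, worker 1 owns odd keys.
tables = [{0: [0.0], 2: [2.0], 4: [4.0]}, {1: [1.0], 3: [3.0]}]
local_ids = [[3, 0, 4], [2, 1, 2]]
result = distributed_lookup(local_ids, tables)
```

In a real multi-process run each worker would hold only its own row of these nested lists, and the two exchanges would be collective calls instead of the in-memory transpose used here.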
Description
Brief Description of the PR:
Fixes # (issue)
Type of change
Checklist:
How Has This Been Tested?
Ran the new tests and the new demos.