Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed distributed strategy registration to be explicit #3361

Merged
merged 3 commits into from
Apr 22, 2023
Merged

Conversation

tgaddair
Copy link
Collaborator

Previously were seeing issues were DDP was getting picked up even though FSDP was being selected.

@github-actions
Copy link

github-actions bot commented Apr 20, 2023

Unit Test Results

    6 files  ±    0      6 suites  ±0   1h 59m 36s ⏱️ + 1h 4m 19s
194 tests +161  164 ✔️ +135  30 💤 +26  0 ±0 
226 runs  +127  187 ✔️ +100  39 💤 +27  0 ±0 

Results for commit 9076aca. ± Comparison against base commit dcca88c.

♻️ This comment has been updated with latest results.

@tgaddair tgaddair merged commit 9e3a98f into master Apr 22, 2023
8 of 10 checks passed
@tgaddair tgaddair deleted the fix-fsdp branch April 22, 2023 03:02
@@ -244,7 +245,7 @@ def tune_batch_size_fn(
# inside of a RayTrainer class, so we manually set the local_rank to 0 so that it picks up the right
# device to tune batch size on.
initialize_pytorch(local_rank=0, local_size=_local_size())
distributed = get_current_dist_strategy(allow_local=True)()
distributed = init_dist_strategy("local")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this still be local after having merged #2934?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch, we resolved this when merging the distirubted batch size tuning pr, so should be good:

https://github.com/ludwig-ai/ludwig/blob/master/ludwig/backend/ray.py#L249

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants