[multimodal] fix num_gpus #3070

Merged
merged 1 commit into from
Mar 29, 2023

Collaborator

@liangfu liangfu commented Mar 21, 2023

Issue #, if available:

num_gpus always becomes 1 when calling predictor.evaluate after fit, which prevents us from supporting multi-GPU evaluation.

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Job PR-3070-68fa0f3 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3070/68fa0f3/index.html

@@ -482,7 +482,7 @@ def predict(
     if predictor._problem_type == OBJECT_DETECTION:
         strategy = "ddp"
 
-    if strategy == "ddp" and predictor._fit_called:
+    if strategy == "ddp" and predictor._fit_called and predictor._problem_type == OBJECT_DETECTION:
Contributor

Have you tested whether it works for other problem types? While working on the object detection task, the blocker was a CUDA barrier, which I think may affect more than just the detection problem type.

Collaborator Author

@liangfu liangfu Mar 28, 2023

Multi-GPU works for other problem types, since we are using DP. I think @zhiqiangdon would have more details.

Contributor

> multi-gpu works for other problem types, since we are using DP. I think @zhiqiangdon would have more detail.

Here we try to allow using DDP for other problem types. Right?

Collaborator Author

This single-line change isn't really about DDP yet; other problem types would actually use DP.

Contributor

This if condition is about DDP. I'm wondering how it affects DP.

Collaborator Author

Yes, you're right, this if condition is about DDP, but it should only affect the OBJECT_DETECTION problem type, not other problem types.

It shouldn't affect DP for other problem types.
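
For readers following the thread, the effect of the one-line change can be sketched in isolation. This is a minimal illustration, not the actual AutoGluon code: the function names, the string value of `OBJECT_DETECTION`, and the assumption that the guarded branch forces `num_gpus` to 1 are all hypothetical, inferred only from the PR description and the diff above.

```python
# Placeholder value for this sketch; the real constant lives in AutoGluon.
OBJECT_DETECTION = "object_detection"


def effective_num_gpus_before(strategy, fit_called, problem_type, num_gpus):
    """Guard as it read before this PR (branch body is an assumption)."""
    if strategy == "ddp" and fit_called:
        # Assumed effect, per the PR description: fall back to a single GPU.
        return 1
    return num_gpus


def effective_num_gpus_after(strategy, fit_called, problem_type, num_gpus):
    """Guard as it reads after this PR: restricted to object detection."""
    if strategy == "ddp" and fit_called and problem_type == OBJECT_DETECTION:
        return 1
    return num_gpus


if __name__ == "__main__":
    # Hypothetical case: a non-detection predictor that reaches the guard
    # with strategy == "ddp" after fit() has been called.
    print(effective_num_gpus_before("ddp", True, "classification", 4))
    print(effective_num_gpus_after("ddp", True, "classification", 4))
```

Under these assumptions, only object detection is still limited to one GPU after fit; any other problem type keeps the requested GPU count, and the "dp" path is untouched by either version of the guard.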

Contributor

@zhiqiangdon zhiqiangdon left a comment

LGTM

@liangfu liangfu merged commit d5fc918 into autogluon:master Mar 29, 2023
@liangfu liangfu deleted the fix-numgpus-1 branch March 29, 2023 20:41