Default object detection handler works only on cuda:0 device in GPU machine #104

Closed
harshbafna opened this issue Mar 18, 2020 · 4 comments · Fixed by #265

@harshbafna
Contributor

Executing pre-trained FasterRCNN/MaskRCNN object detection models with the default object detection handler fails if a GPU device other than cuda:0 is used.

The cuda:0 and CPU devices return similar responses, with minor floating-point differences in the bounding boxes (BB). However, executing on other GPU devices such as cuda:1 or cuda:2 returns the tensor/label/score for only a single BB.
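One way to check for the discrepancy described above is to compare the boxes returned on each device against a reference run. The following is only a minimal sketch: `boxes_match` is a hypothetical helper, the tolerance is an assumed value, and the sample coordinates are made-up placeholders rather than output from the actual models.

```python
import math

def boxes_match(ref, other, tol=1e-3):
    """Return True if both outputs contain the same number of boxes and
    every coordinate agrees within an absolute tolerance `tol`.

    `ref` and `other` are lists of [x1, y1, x2, y2] boxes. The tolerance
    is an assumption; choose one that fits your model and precision.
    """
    if len(ref) != len(other):
        # A different box count (e.g. a single BB) is a real mismatch,
        # not a floating-point difference.
        return False
    return all(
        math.isclose(a, b, abs_tol=tol)
        for box_a, box_b in zip(ref, other)
        for a, b in zip(box_a, box_b)
    )

# Placeholder values for illustration only (not real model output).
cpu_boxes = [[10.0, 20.0, 110.0, 220.0], [30.0, 40.0, 90.0, 160.0]]
cuda0_boxes = [[10.0001, 20.0, 110.0, 219.9999], [30.0, 40.0001, 90.0, 160.0]]
cuda1_boxes = [[10.0, 20.0, 110.0, 220.0]]

print(boxes_match(cpu_boxes, cuda0_boxes))
print(boxes_match(cpu_boxes, cuda1_boxes))
```

Checking the box count before the coordinates separates the "minor floating-point differences" case from the "single BB" case.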

We created the following ticket on the PyTorch forum and are still waiting for guidance on how to address this difference in behavior for object detection (OD) in a multi-GPU environment:

https://discuss.pytorch.org/t/pytorch-different-output-on-different-cuda-device-for-fasterrcnn-maskrcnn/71867/3

For the time being, the OD default handler is hard-coded to use the cuda:0 device in a GPU environment.
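As a rough illustration of what pinning to cuda:0 can look like, here is a minimal sketch. It is not the handler's actual code: `pick_device` and its parameters are hypothetical names, and the function returns only a device string, which a handler could then pass to `torch.device(...)`.

```python
from typing import Optional

def pick_device(cuda_available: bool,
                gpu_id: Optional[int],
                pin_to_first_gpu: bool = True) -> str:
    """Return a device string such as "cuda:0" or "cpu".

    cuda_available:   result of a check like torch.cuda.is_available()
    gpu_id:           GPU index assigned to this worker, or None on CPU
    pin_to_first_gpu: hypothetical flag modelling the workaround; when
                      True, any assigned GPU is overridden with cuda:0
    """
    if not cuda_available or gpu_id is None:
        return "cpu"
    if pin_to_first_gpu:
        # Workaround: ignore the assigned GPU index and always use cuda:0.
        return "cuda:0"
    return f"cuda:{gpu_id}"

print(pick_device(True, 2))                          # pinned to the first GPU
print(pick_device(True, 2, pin_to_first_gpu=False))  # uses the assigned GPU
print(pick_device(False, None))                      # CPU fallback
```

The trade-off of pinning is that every worker lands on the same GPU, so the other devices on a multi-GPU machine stay idle for this handler.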

@harshbafna harshbafna added the bug Something isn't working label Mar 18, 2020
@harshbafna harshbafna self-assigned this Mar 18, 2020
@harshbafna
Contributor Author

Created the following issue in TorchVision: pytorch/vision#1993

@fmassa
Member

fmassa commented Mar 19, 2020

Just answered in the torchvision issue: I was unable to reproduce the problem with the provided snippet using the PyTorch and torchvision nightlies.

@harshbafna harshbafna removed the bug Something isn't working label Mar 20, 2020
@harshbafna
Contributor Author

harshbafna commented Mar 20, 2020

As indicated in this comment, a new version of PyTorch / torchvision will be released in the coming weeks.

Until then, the OD default handler will run on the cuda:0 device only. The default handler will be updated once the fix is available in the stable release.

@dhaniram-kshirsagar
Contributor

The related PR has been merged, hence closing this.

@maaquib maaquib added this to To do in TorchServe v0.2.0 Issues Lifecycle via automation Jul 2, 2020
@maaquib maaquib moved this from To do to Verified (close after merge) in TorchServe v0.2.0 Issues Lifecycle Jul 2, 2020
@maaquib maaquib moved this from Verified (close after merge) to Done in TorchServe v0.2.0 Issues Lifecycle Jul 2, 2020