-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for device selection and multiple GPUs #121
Conversation
In case you decide not to accept this PR, you should at least fix this very minor memory leak. popsift/src/popsift/sift_pyramid.cu Line 215 in 290e142
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not certain if the thread_local can fail for anybody. It can probably not fail for hct, hbuf and so on because those are only used in a thread spawned by PopSift.
I understand that the thread_local forces you to init the filter and the configuration in the extraction/match threads. That leads to more frequent configuration calls, is that problematic?
src/popsift/popsift.cpp
Outdated
@@ -313,14 +339,21 @@ void PopSift::extractDownloadLoop( ) | |||
|
|||
job->setFeatures( features ); | |||
} | |||
|
|||
private_unit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correct that you want to delete the Pyramid every time you have downloaded the features? That would crash if several images have been queued for feature extraction, wouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pyramid is deleted only after pipeline stops using PopSift::uninit
function. Note that private_unit
(this is actually a typo and should be private_uninit
) is called outside while loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you add private_uninit also to the matchPrepareLoop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right.
{ | ||
cudaSetDevice(_device); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good thing to do. Perhaps with an error check when a user chooses an non-existant device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Do you think using device_prop_t::set(int, bool)
would be good? Another solution would be manual check with
POP_CUDA_FATAL_TEST( cudaSetDevice( currentDevice ), "Cannot set device" );
or maybe just POP_CHK
.
Filter function checks if configuration differs so I don't expect it to call any CUDA functions. Second
|
I'm sorry I don't quite understand what it is you want to say here. |
For more robust interface maybe there should be another call to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry that it took me so long before I could review the code. I hope that your improvements have been working well for you.
I have two remaining change requests before approving:
(1) I think that private_unit() should also be called at the end of matchPrepareLoop()
(2) private_uninit would be a better name than private_unit
@mitjap if u can also please update CHANGES.md under v1.0.0 and add one line with the content of this PR |
@simogasp Should I merge this PR and follow up with the additional fixes? Can't push to the original branch. |
If by "pushing to original branch" you mean my branch I think I enabled pushing for popsift maintainers. I can make code changes as requested but at the moment I don't have the time to test it properly. |
@mitjap Thanks, I'll give it another try tomorrow. I had cloned your repo and tried to push the uninit in match and it didn't work, but I may have made some other mistake. |
I made the requested changes to the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fixes!
Description
This pull request enables user to select on which GPU algorithm should run and makes it possible to run on multiple GPUs.
Features list
Implementation remarks
For use of multiple GPUs this implementation requires multiple
PopSift
instances. Main issue was that algorithm uses global state withextern
variables. I made those thread_local which enables each thread to have its own value for specific device.I have not tested matching (
ProcessingMode == MatchingMode
).