
[question] image size limitation? #88

Closed
tic80 opened this issue Apr 24, 2020 · 5 comments · Fixed by #89
@tic80

tic80 commented Apr 24, 2020

When running popsift-demo.exe on a 20-megapixel PGM image (5456x3632), I ran into an out-of-memory error. The code is compiled on Windows 10 with VC++ 2019; I use CUDA 10.2 and a GeForce RTX 2080 with 8 GB of RAM.
Is there a limitation on the size of the image?
Thanks

popsift-demo.exe --print-dev-info  -i 2.pgm
PopSift version: 1.0.0
2.pgm
Choosing device 0: GeForce RTX 2080 with Max-Q Design
Device information:
    Name: GeForce RTX 2080 with Max-Q Design
    Compute Capability:    7.5
    Total device mem:      8589934592 B 8388608 kB 8192 MB
    Per-block shared mem:  49152
    Warp size:             32
    Max threads per block: 1024
    Max threads per SM(X): 1024
    Max block sizes:       {1024,1024,64}
    Max grid sizes:        {2147483647,65535,65535}
    Number of SM(x)s:      46
    Concurrent kernels:    yes
    Mapping host memory:   yes
    Unified addressing:    yes

sift_octave.cu:286
    Could not allocate Intermediate layered array: out of memory
@griwodz
Member

griwodz commented Apr 29, 2020

The reason is that CUDA textures have strict size limits. Until someone adds an alternative code path, the best we can do is give a better and earlier warning. I propose PR #89 for that.

@tic80
Author

tic80 commented Apr 30, 2020

I was looking at the code, and it seems that you upscale the image by a factor of 2 by default.
If I choose a factor of 1, I have enough memory to extract the features.
Is there a big impact on accuracy from not upscaling the input image?

The out-of-memory error comes from allocating a 3D array (cudaMalloc3DArray) of
(width x upscale) x (height x upscale) x (levels - 1)

Would it be possible to use (levels - 1) separate 2D arrays instead?
In that case, there should be enough memory on most common graphics cards.

@simogasp simogasp linked a pull request Apr 30, 2020 that will close this issue
@simogasp simogasp added this to the v1.0.0 milestone Apr 30, 2020
@griwodz
Member

griwodz commented Apr 30, 2020

The upscaling by 2 is the default from the original SIFT paper.
From my understanding of the algorithm, you would lose the highest-quality features: you must take the first image, blur it 4 times, then compute the 3 Difference-of-Gaussian layers, and only then can you search for feature point candidates in the middle one of those 3. So the first layer you can actually search is quite a bit blurrier than the original image. Upscaling fixes that problem.

The RTX 2080 has quite a large amount of memory, but it limits the maximum layered 2D surface size to 65536 (bytes) × 32768 × 2048. Are you sure that it is the memory allocation that fails in your case, and not the creation of the surface?

It is absolutely possible to use much less memory at the price of slightly slower code; it was just never a goal for me. Several of the currently supported alternative downscaling approaches would not work in that case, and feature points would have to be collected differently, but it could be done.

Unfortunately, I'm not able to do it in the foreseeable future.

@fabiencastan
Member

You can adjust the SIFT params in popsift::Config.
See setDownsampling:

void setDownsampling( float v );
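A minimal usage sketch, assuming the `popsift::Config` API as shown in this thread. The header names and the `PopSift` constructor call are paraphrased from memory and may not match the current API exactly; per this thread, a value of 0 processes the image at its input resolution, while the default upscales by a factor of 2.

```cpp
#include <popsift/popsift.h>
#include <popsift/sift_conf.h>

popsift::Config config;
config.setDownsampling( 0.0f );  // 0 = no upscaling; avoids the oversized
                                 // intermediate allocation discussed above

PopSift popSift( config );       // hedged: constructor signature assumed
```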

@tic80
Author

tic80 commented Apr 30, 2020

@fabiencastan config.setDownsampling(0); solves my issue, thanks!

@griwodz thanks for the explanation, it does make sense.
The first allocation that fails is cudaMalloc3DArray in Octave::alloc_interm_array().
The maximum layered 2D surface supported by my graphics card is 32768 x 32768 x 2048.
