Description
The current code limits the maximum width of an image:
image width (in bytes) < cudaDeviceProp->maxSurface2DLayered[0]
For many GPUs, this is a very strict condition that limits the image to 4096 elements in width.
However, the CUDA documentation defines `maxSurface2DLayered` as maximum dimensions (i.e. number of elements). Maybe I'm wrong, but after some tests, I think I can confirm that `maxSurface2DLayered` is not in bytes.
This PR changes the way we check surface2DLayered dimensions. We now use `maxSurface2DLayered[0]` as a maximum width limit in elements.
This PR also changes the `Plane_2d` object methods `getPitchInBytes()` and `getByteSize()` to return `size_t` instead of `short`, to avoid overflow in the case of large images.
Implementation remarks
Tested with CUDA 11.3 on a Quadro M4000 with multiple 7500x7500 images.
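To illustrate the two changes described above, here is a minimal sketch in plain C++ (no CUDA runtime needed). It is not the code from this PR: the helper names `fitsSurface2DLayered` and `rowSizeInBytes`, the inclusive `<=` comparison, the 16-byte element size, and the limit values in `kExampleLimits` are all assumptions chosen for illustration, not values queried from a real device.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper (not from the PR): checks a layered 2D surface against
// limits laid out like cudaDeviceProp::maxSurface2DLayered, i.e.
// {width, height, layers}, all interpreted as element counts, not bytes.
// The inclusive comparison is an assumption.
constexpr bool fitsSurface2DLayered(std::size_t widthElems, std::size_t height,
                                    std::size_t layers, const int maxDims[3]) {
    return widthElems <= static_cast<std::size_t>(maxDims[0]) &&
           height     <= static_cast<std::size_t>(maxDims[1]) &&
           layers     <= static_cast<std::size_t>(maxDims[2]);
}

// Hypothetical helper: size of one row in bytes, returned as size_t so that
// wide rows cannot overflow a narrow integer type.
constexpr std::size_t rowSizeInBytes(std::size_t widthElems, std::size_t elemSize) {
    return widthElems * elemSize;
}

// Illustrative limits only (assumption), not taken from any specific GPU.
constexpr int kExampleLimits[3] = {65536, 65536, 2048};

// A 7500x7500 image with assumed 16-byte elements has 120000-byte rows.
static_assert(rowSizeInBytes(7500, 16) == 120000, "row size in bytes");
// That row size does not fit in a short ...
static_assert(rowSizeInBytes(7500, 16) > static_cast<std::size_t>(SHRT_MAX),
              "row size exceeds the range of short");
// ... and comparing bytes against the width limit would reject the image,
static_assert(rowSizeInBytes(7500, 16) > static_cast<std::size_t>(kExampleLimits[0]),
              "byte-based check rejects the image");
// while comparing the width in elements accepts it.
static_assert(fitsSurface2DLayered(7500, 7500, 1, kExampleLimits),
              "element-based check accepts the image");
```

With these assumed numbers, the byte-based comparison rejects an image that the element-based comparison accepts, and the row size is already too large for a `short`, which is the overflow the return type change is meant to avoid.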