cudev: fix 1D error introduced in PR 3378 #3418

Merged
merged 1 commit into from Jan 13, 2023

@cudawarped cudawarped commented Jan 10, 2023

This fixes issue #3412, where incorrect results were returned for GpuMats with a single row. The bug was the result of an error in #3378.

Unfortunately, GpuMats with a single row or column are allocated with cudaMalloc rather than cudaMallocPitch, and 2D texture objects require pitched memory, so this PR introduces extra staging memory when the row or column dimension is 1. This is a quick fix that adds overhead only for single-dimension GpuMats, whose use should not be that common.
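
As a rough illustration of the rule described above, here is a minimal host-side sketch in plain C++ (no CUDA calls). It is not the code from this PR: `needsStaging` and `alignedPitch` are hypothetical helpers, and the alignment is passed in as a parameter because the real value is a device property (`cudaDeviceProp::texturePitchAlignment`).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OpenCV): true for the shapes that the PR
// describes as allocated with cudaMalloc, i.e. unpitched memory, which would
// need a pitched staging copy before being bound to a 2D texture object.
inline bool needsStaging(int rows, int cols) {
    return rows == 1 || cols == 1;
}

// Hypothetical helper: round a row's width in bytes up to the next multiple
// of `alignment`. The alignment value is an assumption supplied by the caller.
inline std::size_t alignedPitch(std::size_t rowBytes, std::size_t alignment) {
    assert(alignment > 0);
    return (rowBytes + alignment - 1) / alignment * alignment;
}
```

In a real implementation the staging buffer would presumably come from cudaMallocPitch, which chooses the pitch itself; the arithmetic here only shows why a tightly packed single row generally does not satisfy a pitch requirement.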

A "better" fix would be to rework the TexturePtr class so that

__device__ __forceinline__ R operator ()(index_type y, index_type x) const;

statically dispatches a call to tex1Dfetch when either the x or y dimension is 1, but this would be a more significant change.
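
A possible shape for that dispatch, sketched on the host in plain C++ so it can be compiled without CUDA. Everything here is hypothetical: `FakeTexture` stands in for a texture object, `fetch2D` and `fetch1D` model `tex2D` and `tex1Dfetch`, and `TexReader` is an illustrative name, not the actual TexturePtr class.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for a texture over tightly packed data.
struct FakeTexture {
    const float* data;
    int rows;
    int cols;
    float fetch2D(int y, int x) const { return data[y * cols + x]; } // models tex2D
    float fetch1D(int i) const { return data[i]; }                   // models tex1Dfetch
};

// The fetch path is selected at compile time by the template argument.
template <bool Is1D> struct TexReader;

template <> struct TexReader<false> {
    FakeTexture tex;
    float operator()(int y, int x) const { return tex.fetch2D(y, x); }
};

template <> struct TexReader<true> {
    FakeTexture tex;
    // With a single row y is 0, and with a single column cols is 1 and x is 0,
    // so y * cols + x is the linear element index in both cases.
    float operator()(int y, int x) const { return tex.fetch1D(y * tex.cols + x); }
};
```

The caller would still have to pick the specialization from the runtime shape of the GpuMat, which is part of why this is a larger change than the staging approach.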

Dependent on opencv/opencv#23126

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
buildworker:Custom=linux-1
docker_image:Custom=ubuntu-cuda:16.04

@cudawarped cudawarped marked this pull request as draft January 11, 2023 07:14
@cudawarped cudawarped marked this pull request as ready for review January 11, 2023 14:52
Development

Successfully merging this pull request may close these issues.

CUDA Remap gives incorrect result or crashes for 1 row/column src Mats