Question about CUDABuffer_<T> #49

Open · lanfeiying opened this issue May 17, 2020 · 4 comments

@lanfeiying

Hi,
I am new to CUDA programming and learning from this open-source project. Recently I have been reading its source code, and I have a question about it. I would appreciate it if you could read this issue and answer my question.

The basic data structures are defined in the files under libvis/cuda/.

In /applications/badslam/src/badslam/cuda_depth_processing.cu, the function ComputeNormalsCUDA takes an argument CUDABuffer_<T>* out_normals. I understand this step: a pointer to the buffer, out_normals, is passed into the function.

When this function launches the kernel function ComputeNormalsCUDAKernel(), it passes *out_normals into the kernel as an object. However, the kernel receives this object by value in its argument list, not by reference. My question is: how can modifications to *out_normals inside the kernel affect the original out_normals?

In other words, the kernel computes the normals and writes the corresponding elements with out_normals(y, x) = ..., but that out_normals is not a reference to the original object; it is a copy of it.

Is it because, when the object *out_normals is passed into the kernel, the copy only duplicates the buffer's base address instead of allocating new memory on the GPU, so that when the copy invokes the operator out_normals(y, x), it writes to the data behind the original base pointer?

Thank you very much for taking the time to read this question!

@puzzlepaint (Collaborator)

Is it because, when the object *out_normals is passed into the kernel, the copy only duplicates the buffer's base address instead of allocating new memory on the GPU, so that when the copy invokes the operator out_normals(y, x), it writes to the data behind the original base pointer?

Yes, that is exactly how it works.

@puzzlepaint (Collaborator)

To elaborate a bit: there are two classes, CUDABuffer and CUDABuffer_, that behave quite differently. The former, CUDABuffer, is a bit higher-level and 'owns' the GPU buffer that it allocates; it is meant to be used from CPU code. For historical reasons (working with old CUDA versions), I could not pass such higher-level objects to CUDA code, since there were incompatibilities. So I needed a lightweight helper class to access the same functionality from CUDA code. This is what CUDABuffer_ does. It is a simple wrapper over a few attributes of the buffer, and it can be retrieved with CUDABuffer::ToCUDA(). Copying a CUDABuffer_ object merely copies the pointer to the buffer, not the buffer itself.
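
For anyone new to this pattern, here is a minimal, self-contained sketch of the idea. BufferView and FillKernel are hypothetical names rather than the actual libvis API, and the real CUDABuffer_ also stores a row pitch; the point is only that the struct passed by value holds the device pointer, so writes through it land in the original allocation:

```cpp
#include <cuda_runtime.h>

// Hypothetical, simplified analogue of CUDABuffer_<T>: it holds only the
// device base pointer and the layout, and never owns the allocation.
template <typename T>
struct BufferView {
  T* data;    // device base address
  int width;
  int height;

  __device__ T& operator()(int y, int x) {
    return data[y * width + x];
  }
};

template <typename T>
__global__ void FillKernel(BufferView<T> buf, T value) {
  int x = blockIdx.x * blockDim.x + threadIdx.x;
  int y = blockIdx.y * blockDim.y + threadIdx.y;
  if (x < buf.width && y < buf.height) {
    // buf is a by-value copy, but buf.data still points at the original
    // allocation, so this write is visible outside the kernel.
    buf(y, x) = value;
  }
}

int main() {
  float* d_data = nullptr;
  cudaMalloc(&d_data, 64 * 64 * sizeof(float));

  BufferView<float> view{d_data, 64, 64};

  // Only the small struct (one pointer plus two ints) is copied into the
  // kernel; the 64x64 buffer itself is never duplicated.
  FillKernel<<<dim3(4, 4), dim3(16, 16)>>>(view, 1.0f);
  cudaDeviceSynchronize();

  cudaFree(d_data);
  return 0;
}
```

This is the same mechanism as passing a raw T* to a kernel; the wrapper just adds the size information and the convenient (y, x) indexing.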

@lanfeiying (Author)

Thank you very much for your answer!
I have another question about applications/badslam/src/badslam/cuda_depth_processing.cu.
The CUDA kernel function ComputeMinMaxDepthCUDAKernel() is a template function; its template arguments are block_width and block_height.

ComputeMinMaxDepthCUDA() launches the above CUDA kernel. However, I see that it passes block_width and block_height as template arguments without any prior definition.

I searched for these two variables in the whole project, and they only appear here. If these two variables are not defined anywhere, how does the C++ compiler recognize them?

@puzzlepaint (Collaborator)

block_width and block_height are defined by the macro CUDA_AUTO_TUNE_2D_TEMPLATED(), which calls CUDA_AUTO_TUNE_2D_BORDER_TEMPLATED(), in libvis/src/libvis/cuda/cuda_auto_tuner.h.
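
To illustrate the mechanism with a deliberately simplified, hypothetical macro (the real CUDA_AUTO_TUNE_2D_TEMPLATED() additionally tries several block sizes and keeps the fastest): the macro declares block_width and block_height in a local scope right before the kernel launch it expands to, so the identifiers written at the call site only become valid through the macro expansion itself:

```cpp
#include <cuda_runtime.h>

// Groups a comma-containing template argument list into a single macro
// argument (hypothetical helper, analogous in spirit to libvis' macros).
#define TEMPLATE_ARGS(...) __VA_ARGS__

// Hypothetical, much-simplified launcher: the real auto-tuner picks
// block_width / block_height dynamically; here they are hard-coded to
// show the scoping trick only.
#define LAUNCH_2D_TEMPLATED(kernel, targs, grid, block, ...)   \
  do {                                                         \
    constexpr int block_width = 32;                            \
    constexpr int block_height = 8;                            \
    kernel<targs><<<(grid), (block)>>>(__VA_ARGS__);           \
  } while (0)

template <int block_width, int block_height>
__global__ void ExampleKernel(float* out) {
  // The template parameters are compile-time constants, so they can size
  // static shared memory.
  __shared__ float tile[block_height][block_width];
  tile[threadIdx.y][threadIdx.x] = 0.0f;
  if (threadIdx.x == 0 && threadIdx.y == 0) {
    out[blockIdx.y * gridDim.x + blockIdx.x] = tile[0][0];
  }
}

void Launch(float* d_out) {
  // block_width and block_height below are not defined anywhere in the
  // surrounding code; the macro expansion introduces them right before
  // the kernel call, which is why a project-wide text search finds no
  // separate definition.
  LAUNCH_2D_TEMPLATED(ExampleKernel,
                      TEMPLATE_ARGS(block_width, block_height),
                      dim3(10, 10), dim3(32, 8), d_out);
}

int main() {
  float* d_out = nullptr;
  cudaMalloc(&d_out, 100 * sizeof(float));
  Launch(d_out);
  cudaDeviceSynchronize();
  cudaFree(d_out);
  return 0;
}
```

So from the compiler's point of view the names are perfectly ordinary local constants; they just don't exist until the preprocessor has expanded the macro.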
