
question: is there any method to avoid mapping same page? #258

Closed
hongbilu opened this issue May 16, 2023 · 6 comments

Comments

@hongbilu

hi, there
i saw there's test case named "basic_small_buffers_mapping". If cudamalloc many times(far more than twice), is there any method to check if this memory's page has been already mapped? if it has been mapped by others, maybe we should use va from matched handle, the map API should not return failure?

@pakmarkthub
Collaborator

Hi @hongbilu,

We don't provide such an API. The contract of the pin and map APIs covers only the buffer you pin and map. It is unsafe to assume that you can use the same CPU VA range obtained from mapping CUDA buffer A to access CUDA buffer B.

@hongbilu
Author

> Hi @hongbilu,
>
> We don't provide such API. And the agreement of the pin and map API is within the buffer you pin and map. It is unsafe to assume that you can use the same CPU VA range from mapping CUDA buffer A to access CUDA buffer B.

Yes, but cudaMalloc cannot guarantee that the returned addresses land on different pages. In fact, they quite often share a page when allocating small buffers, which is a very common usage. The problem is that applications would then have to manage all the CUDA memory themselves and check whether a CPU VA falls in the same range as another's; that is additional work and a dirty, application-specific solution. What do you think?

@pakmarkthub
Collaborator

Let's say that you have two CUDA buffers A and B from cudaMalloc. You will be able to pin both A and B, but you may not be able to map them. gdr_map requires the start address (which does not have to be at the beginning of the buffer) to be GPU BAR1 page aligned. cudaMalloc does not guarantee that alignment. If you want to use GDRCopy to create a CPU VA for your buffers, you must manually adjust the alignment (see https://github.com/NVIDIA/gdrcopy/blob/master/tests/common.cpp#L46). Generally, this means allocating each buffer with a size larger than a GPU BAR1 page. Thus, the buffers should not be that small anyway.

basic_small_buffers_mapping is a unit test to ensure that we can do gdr_pin on two small contiguous buffers. But even if you can pin them, you cannot map the second one, as it is not GPU BAR1 page aligned. It is probably not what you are looking for.

@hongbilu
Author

Thanks! So it needs manual over-allocation of the buffers, which makes it not easy for clients to use.

@pakmarkthub
Collaborator

You may use CUDA VMM instead of cudaMalloc. VMM always guarantees that CUDA VA is page aligned.
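For reference, allocating through the CUDA VMM driver API looks roughly like the sketch below: physical memory is created with cuMemCreate and mapped into a VA range reserved at the allocation granularity, so the resulting CUdeviceptr is always aligned. This is a sketch, not a drop-in implementation: it requires an NVIDIA GPU and CUDA 10.2+, the gpuDirectRDMACapable flag needs a newer driver, and error handling is omitted for brevity.

```c
#include <cuda.h>
#include <stdio.h>

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    CUmemAllocationProp prop = {0};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;
    /* Request GPUDirect-RDMA-capable memory where the driver supports it. */
    prop.allocFlags.gpuDirectRDMACapable = 1;

    size_t gran = 0;
    cuMemGetAllocationGranularity(&gran, &prop,
                                  CU_MEM_ALLOC_GRANULARITY_MINIMUM);

    /* Round the requested size up to the allocation granularity. */
    size_t size = ((4096 + gran - 1) / gran) * gran;

    CUmemGenericAllocationHandle handle;
    cuMemCreate(&handle, size, &prop, 0);

    /* Reserve an aligned VA range and map the physical allocation into it. */
    CUdeviceptr d_ptr = 0;
    cuMemAddressReserve(&d_ptr, size, gran, 0, 0);
    cuMemMap(d_ptr, size, 0, handle, 0);

    CUmemAccessDesc access = {0};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(d_ptr, size, &access, 1);

    printf("aligned device VA: 0x%llx (granularity %zu)\n",
           (unsigned long long)d_ptr, gran);

    /* d_ptr is page aligned, so it can be handed to gdr_pin_buffer/gdr_map
       without the manual rounding needed for cudaMalloc. */
    cuMemUnmap(d_ptr, size);
    cuMemAddressRelease(d_ptr, size);
    cuMemRelease(handle);
    cuCtxDestroy(ctx);
    return 0;
}
```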

@hongbilu
Author

I really appreciate the reminder! Thanks.
