
refactor template order allocMappedBuf #2270

Merged

Conversation

SimeonEhrig
Member

Move the TPlatform template parameter to the last position. There is no need to specify the platform in the template signature if we pass the platform as an instance.

Follow-up of #2162.
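For illustration, here is a minimal sketch of the call-site change, assuming the usual alpaka CPU host setup; the setup helpers (PlatformCpu, getDevByIdx, Vec) and the exact post-change signature follow my reading of this discussion and may differ from the final API.

```cpp
#include <alpaka/alpaka.hpp>

#include <cstddef>

// Sketch only: the platform no longer needs to appear in the template
// signature once it is passed as a function argument.
void allocMappedExample()
{
    using Elem = float;
    using Idx = std::size_t;

    auto const platformHost = alpaka::PlatformCpu{};
    auto const host = alpaka::getDevByIdx(platformHost, 0);
    auto const platformAcc = alpaka::PlatformCpu{}; // stand-in for any accelerator platform
    auto const extent = alpaka::Vec<alpaka::DimInt<1u>, Idx>{Idx{1024}};

    // Before: the platform type had to be spelled out as a template parameter.
    // auto buf = alpaka::allocMappedBuf<decltype(platformAcc), Elem, Idx>(host, extent);

    // After this PR: TPlatform is deduced from the platform instance.
    auto buf = alpaka::allocMappedBuf<Elem, Idx>(host, platformAcc, extent);
}
```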

psychocoderHPC previously approved these changes May 16, 2024
@psychocoderHPC
Member

psychocoderHPC commented May 16, 2024

Please also update

| cudaMallocHost | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |

It was also forgotten during the change from the platform type to the platform instance.

Maybe we should use allocMappedBufIfSupported() here.

@psychocoderHPC
Member

psychocoderHPC commented May 16, 2024

Please also update

| cudaHostAlloc | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |

It should become:

alpaka::allocMappedBuf<TElement>(host, platform, extents) 1D, 2D, 3D supported!

@SimeonEhrig
Member Author

> Please also update
>
> | cudaHostAlloc | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |
>
> It should become:
>
> alpaka::allocMappedBuf<TElement>(host, platform, extents) 1D, 2D, 3D supported!

I think alpaka::allocMappedBuf<TElement, TIdx>(host, platform, extents) 1D, 2D, 3D supported! should be correct.

@@ -363,7 +363,7 @@ The following tables list the functions available in the `CUDA Runtime API <http
 +----------------------------+--------------------------------------------------------------------------------------------+
 | cudaGetSymbolSize | -- |
 +----------------------------+--------------------------------------------------------------------------------------------+
-| cudaHostAlloc | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |
+| cudaHostAlloc | alpaka::allocMappedBuf<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! |
Member

Sorry, my comment was wrong; this should be allocMappedBufIfSupported. There is no strong need for this memory to be mapped, and allocMappedBuf is not available on every accelerator or device.

@@ -363,7 +363,7 @@ The following tables list the functions available in the `CUDA Runtime API <http
 +----------------------------+--------------------------------------------------------------------------------------------+
 | cudaGetSymbolSize | -- |
 +----------------------------+--------------------------------------------------------------------------------------------+
-| cudaHostAlloc | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |
+| cudaHostAlloc | alpaka::allocMappedBufIfSupported<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! |
Contributor

This, too, should just be:

Suggested change
-| cudaHostAlloc | alpaka::allocMappedBufIfSupported<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! |
+| cudaHostAlloc | alpaka::allocMappedBuf<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! |

Member

I do not think that allocMappedBuf is correct here. allocMappedBuf can fail if mapped memory is not supported for the current device, so the code would not run on all devices. allocMappedBufIfSupported is safe for all kinds of devices.

The CUDA API function cudaHostAlloc pins memory but does not guarantee that the memory is visible on the GPU device, so the best-fitting alpaka equivalent is allocMappedBufIfSupported.
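To make that difference concrete, here is a hedged sketch of the two variants as I read this thread; the exact signatures and failure behavior are assumptions, not verified against the current alpaka headers.

```cpp
#include <alpaka/alpaka.hpp>

// Sketch: allocMappedBuf insists on a device-visible mapping and is expected
// to fail on platforms whose devices cannot map host memory, while
// allocMappedBufIfSupported falls back to a plain (non-mapped) host buffer
// and therefore works on every device.
template<typename TElem, typename TIdx, typename THost, typename TPlatform, typename TExtent>
auto allocPortableHostBuf(THost const& host, TPlatform const& platform, TExtent const& extent)
{
    // Safe choice for generic code: degrades gracefully when mapping is unsupported.
    return alpaka::allocMappedBufIfSupported<TElem, TIdx>(host, platform, extent);

    // Strict alternative, only valid where mapped memory is supported:
    // return alpaka::allocMappedBuf<TElem, TIdx>(host, platform, extent);
}
```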

Contributor

> The CUDA API function cudaHostAlloc pins memory but does not guarantee that the memory is visible on the GPU device, so the best-fitting alpaka equivalent is allocMappedBufIfSupported

It does guarantee that. From the CUDA Runtime API:

__host__ cudaError_t cudaHostAlloc ( void** pHost, size_t size, unsigned int flags )

Description

Allocates size bytes of host memory that is page-locked and accessible to the device.

Contributor

In any case, cudaMallocHost() and cudaHostAlloc() are identical, so they should map to the same functionality in alpaka.
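For reference, a small CUDA sketch of the equivalence being claimed here; with the default flags both calls should behave the same (error handling omitted for brevity).

```cpp
#include <cuda_runtime.h>

#include <cstddef>

// Sketch: both allocations are page-locked host memory that the device can
// access; cudaHostAlloc with cudaHostAllocDefault matches cudaMallocHost.
void pinnedAllocExample()
{
    std::size_t const bytes = 1024;
    void* a = nullptr;
    void* b = nullptr;

    cudaMallocHost(&a, bytes);                      // classic pinned host allocation
    cudaHostAlloc(&b, bytes, cudaHostAllocDefault); // same behavior via the flag-based API

    cudaFreeHost(a);
    cudaFreeHost(b);
}
```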

Member

Ohh, I was not aware that cudaMallocHost is the C++ API equivalent of cudaHostAlloc. I have always used cudaHostAlloc with flags to create mapped memory.

In this case, IMO both should map to the same thing. The problem I had is that for both CUDA functions you can configure the behavior via flags, whereas we only have named functions. Maybe we should add both allocMappedBufIfSupported and allocMappedBuf to the documentation and let the user decide what fits their problem best, because the correct replacement depends on the flags used in the native CUDA function call.
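As an illustration of the flags-versus-named-functions point, a small CUDA sketch; which alpaka named function fits each flag combination is exactly what is being debated here, so the comments in the sketch only restate that open question.

```cpp
#include <cuda_runtime.h>

#include <cstddef>

// Sketch: cudaHostAlloc selects its behavior via flags, while alpaka offers
// distinct named functions (allocMappedBuf, allocMappedBufIfSupported), so
// the right replacement depends on the flags used at the original call site.
void hostAllocFlagsExample()
{
    std::size_t const bytes = 1024;
    void* pinnedOnly = nullptr;
    void* pinnedMapped = nullptr;

    // Default flags: page-locked host memory.
    cudaHostAlloc(&pinnedOnly, bytes, cudaHostAllocDefault);

    // Mapped flag: additionally requests a device-accessible mapping.
    cudaHostAlloc(&pinnedMapped, bytes, cudaHostAllocMapped);

    void* devView = nullptr;
    cudaHostGetDevicePointer(&devView, pinnedMapped, 0); // device-side view of the mapped buffer

    cudaFreeHost(pinnedMapped);
    cudaFreeHost(pinnedOnly);
}
```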

Member Author

I changed it back to alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) and added a footnote for alpaka::allocMappedBufIfSupported<TPlatform, TElement>(host, extents).

@@ -363,7 +363,7 @@ The following tables list the functions available in the `CUDA Runtime API <http
 +----------------------------+--------------------------------------------------------------------------------------------+
 | cudaGetSymbolSize | -- |
 +----------------------------+--------------------------------------------------------------------------------------------+
-| cudaHostAlloc | alpaka::allocMappedBuf<TPlatform, TElement>(host, extents) 1D, 2D, 3D supported! |
+| cudaHostAlloc | alpaka::allocMappedBuf<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! [1] |
Contributor

Suggested change
-| cudaHostAlloc | alpaka::allocMappedBuf<TElement, TIdx>(host, extents) 1D, 2D, 3D supported! [1] |
+| cudaHostAlloc | alpaka::allocMappedBuf<TElement, TIdx>(host, platform, extents) 1D, 2D, 3D supported! [1] |

Contributor

also below

@psychocoderHPC psychocoderHPC merged commit 283e1b9 into alpaka-group:develop May 28, 2024
22 checks passed
3 participants