Reduce code duplication in CUDA memcpy and memset implementations #981

BenjaminW3
Member

Since they are now using the same underlying CUDA API calls, they can reuse
the implementation and add the final wait depending on the queue type.

Result: -270 lines of code
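
As a rough illustration of the idea described above, the following minimal sketch shows one shared enqueue routine that is used by both queue types and only adds the final wait for the blocking one. It does not use the real CUDA or alpaka APIs: `FakeStream`, `QueueBlocking`, `QueueNonBlocking`, `IsBlocking` and `enqueueMemset` are hypothetical names invented for this example, and the "stream" simply stores work until `synchronize()` is called.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CUDA stream (not a real API):
// work is recorded on enqueue() and only executed on synchronize().
struct FakeStream
{
    std::vector<std::function<void()>> pending;

    void enqueue(std::function<void()> op)
    {
        pending.push_back(std::move(op));
    }

    void synchronize()
    {
        for(auto& op : pending)
        {
            op();
        }
        pending.clear();
    }
};

// Hypothetical queue types; both wrap the same kind of stream.
struct QueueNonBlocking
{
    FakeStream stream;
};

struct QueueBlocking
{
    FakeStream stream;
};

// Trait that tells the shared implementation whether a final wait is needed.
template<typename TQueue>
struct IsBlocking : std::false_type
{
};

template<>
struct IsBlocking<QueueBlocking> : std::true_type
{
};

// Single shared implementation: the asynchronous operation is enqueued the
// same way for both queue types; only the blocking queue waits afterwards.
template<typename TQueue>
void enqueueMemset(TQueue& queue, void* dst, int value, std::size_t bytes)
{
    queue.stream.enqueue([=] { std::memset(dst, value, bytes); });

    if(IsBlocking<TQueue>::value)
    {
        queue.stream.synchronize();
    }
}
```

With this layout, calling `enqueueMemset` on a `QueueBlocking` returns only after the buffer has been written, while on a `QueueNonBlocking` the write becomes visible once the caller synchronizes the stream. The same pattern would apply to a copy operation, which is why a single base implementation can serve both.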

@BenjaminW3 force-pushed the topic-reduce-cuda-mem-op-code-duplication branch from bb3a434 to 11ddb14 on April 21, 2020 18:47
@psychocoderHPC psychocoderHPC added this to the Version 0.5.0 milestone Apr 22, 2020
@psychocoderHPC
Member

@BenjaminW3 Please always set the milestone. This helps when creating the changelog, or later when using the PIConGPU tool to generate it automatically.

@BenjaminW3
Member Author

@psychocoderHPC Could you please approve the changes? The Mac build is flaky.

@psychocoderHPC
Member

Yes, I am still reviewing it.

@BenjaminW3
Member Author

The changeset looks huge, but I only moved the code into common implementation base classes (these already existed for copy but not for set).

@psychocoderHPC psychocoderHPC merged commit a54aef4 into alpaka-group:develop Apr 22, 2020
@BenjaminW3 BenjaminW3 deleted the topic-reduce-cuda-mem-op-code-duplication branch April 22, 2020 09:03