Core: variadic templates for backend. #559

Draft
wants to merge 5 commits into
base: develop
from


Member

@GPMueller GPMueller commented Mar 28, 2020

The assign and reduce parallelisation functions can now handle arbitrary numbers of function arguments, which are forwarded to the lambda.
Therefore, the lambda is now placed before the variadic arguments.
Note that they can now also be called without arguments, allowing e.g. an assignment of zero to an entire field.
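
As a rough illustration of the calling convention described above, the following serial C++ sketch puts the lambda before the variadic arguments and forwards one element of each argument to it. The namespace `sketch`, the signature of `assign`, and the use of raw pointers are assumptions made for this example; they are not taken from the PR's diff, and the real backend dispatches to OpenMP or CUDA rather than running a plain loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not the project's backend
{

// Assign f(args[i]...) to out[i] for every index i.
// The lambda comes before the parameter pack, so the pack may hold any
// number of input arrays, including none.
template<typename T, typename F, typename... Args>
void assign( std::vector<T> & out, F f, const Args *... args )
{
    for( std::size_t i = 0; i < out.size(); ++i )
        out[i] = f( args[i]... );
}

// Usage: one call with two forwarded inputs, one call with an empty pack.
inline void demo_assign()
{
    const std::vector<double> a{ 1.0, 2.0, 3.0 };
    const std::vector<double> b{ 10.0, 20.0, 30.0 };
    std::vector<double> out( 3 );

    // Two forwarded arguments: out[i] = a[i] + b[i]
    assign( out, []( double x, double y ) { return x + y; }, a.data(), b.data() );
    assert( out[0] == 11.0 && out[1] == 22.0 && out[2] == 33.0 );

    // No forwarded arguments: assign zero to the entire field
    std::vector<double> field( 4, 7.0 );
    assign( field, [] { return 0.0; } );
    assert( field[0] == 0.0 && field[3] == 0.0 );
}

} // namespace sketch
```

Passing the inputs as function arguments, rather than capturing them in the lambda, is the design choice this sketch is meant to show.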

This PR is related to issue #529.

TODO:

  • create variadic lambdas in backend and update usage
  • replace Vectormath and Manifoldmath with backend functions to get rid of the CUDA files


coveralls commented Mar 28, 2020

Coverage Status

Coverage remained the same at 79.423% when pulling 61da747 on feature-variadic-lambda into 55cc299 on develop.

CUDA lambdas cannot capture variadic parameters, which the variadic reduce would require.
The API therefore takes an integer size argument again: without it, a call with an empty parameter pack would not work, and the CUDA version of reduce needs the size in any case.
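
The role of the explicit size can be illustrated with a serial sketch: the element count is passed separately, so a call with an empty parameter pack still has a well-defined range. The name `sketch::reduce`, its signature, and the choice of summation as the reduction are assumptions for illustration only; the PR's actual CUDA implementation is not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not the project's backend
{

// Sum f(args[i]...) over i = 0 .. n-1.
// n is explicit, so the call also works when the pack is empty.
template<typename F, typename... Args>
double reduce( std::size_t n, F f, const Args *... args )
{
    double sum = 0.0;
    for( std::size_t i = 0; i < n; ++i )
        sum += f( args[i]... );
    return sum;
}

inline void demo_reduce()
{
    const std::vector<double> a{ 1.0, 2.0, 3.0 };
    const std::vector<double> b{ 4.0, 5.0, 6.0 };

    // Dot product: two arrays are forwarded to the lambda.
    const double dot = reduce(
        a.size(), []( double x, double y ) { return x * y; }, a.data(), b.data() );
    assert( dot == 32.0 );

    // Empty pack: the lambda takes no arguments, so only n defines the range.
    const double count = reduce( 5, [] { return 1.0; } );
    assert( count == 5.0 );
}

} // namespace sketch
```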

codecov bot commented Apr 2, 2020

Codecov Report

Merging #559 into develop will increase coverage by 0.21%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #559      +/-   ##
===========================================
+ Coverage    50.06%   50.27%   +0.21%     
===========================================
  Files           88       88              
  Lines        10136    10183      +47     
===========================================
+ Hits          5075     5120      +45     
- Misses        5061     5063       +2     

- replaced all usage of `fill`
- replaced all usage of `normalize_vectors`
- tidied up the Solvers by using `Backend::par::apply`. With this change, particularly RK4 and Heun should become faster in OpenMP and CUDA
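
To show how helpers such as `fill` and `normalize_vectors` can be expressed through one generic index loop, here is a serial sketch. The signature of `sketch::apply` (a count followed by a lambda that receives the index) is an assumption and may differ from the real `Backend::par::apply`; `Vec3` is a hypothetical stand-in for the project's vector type.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not the project's backend
{

struct Vec3
{
    double x, y, z;
};

// Call f(i) for every index in [0, n). Serial stand-in for a parallel loop.
template<typename F>
void apply( std::size_t n, F f )
{
    for( std::size_t i = 0; i < n; ++i )
        f( i );
}

// fill expressed through apply; the pointer and the value are captured by value.
inline void fill( std::vector<double> & v, double value )
{
    double * p = v.data();
    apply( v.size(), [p, value]( std::size_t i ) { p[i] = value; } );
}

// normalize_vectors expressed through apply; assumes every vector is nonzero.
inline void normalize_vectors( std::vector<Vec3> & v )
{
    Vec3 * p = v.data();
    apply( v.size(), [p]( std::size_t i ) {
        const double norm = std::sqrt( p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z );
        p[i] = { p[i].x / norm, p[i].y / norm, p[i].z / norm };
    } );
}

inline void demo_apply()
{
    std::vector<double> field( 3, 1.0 );
    fill( field, 0.0 );
    assert( field[0] == 0.0 && field[2] == 0.0 );

    std::vector<Vec3> spins{ { 3.0, 0.0, 4.0 } };
    normalize_vectors( spins );
    assert( std::abs( spins[0].x - 0.6 ) < 1e-12 );
    assert( std::abs( spins[0].z - 0.8 ) < 1e-12 );
}

} // namespace sketch
```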
Member Author

GPMueller commented Apr 21, 2020

Unfortunately, the extended lambdas in CUDA are more restricted than I thought, see https://docs.nvidia.com/cuda/cuda-c-programming-guide/#extended-lambda-restrictions

The main issue is this rule:

  1. If the enclosing function is a class member, then the following conditions must be satisfied:
    • All classes enclosing the member function must have a name.
    • The member function must not have private or protected access within its parent class.
    • All enclosing classes must not have private or protected access within their respective parent classes.

This means that, for example, most Method_GNEB functions and members would need to be public; otherwise the class cannot really use Backend::par::apply etc.
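
One possible workaround, sketched below in plain C++, is to leave the class interface as it is and move the code that creates the lambda into a free function, so that the lambda's enclosing function is no longer a private or protected member. `Method`, `scale_all`, and `sketch::apply` are hypothetical names for this illustration, and whether such a refactoring suits Method_GNEB is not established in this thread. A host compiler does not enforce the rule; it only matters under nvcc with extended lambdas enabled, where the lambda would carry a `__device__` annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch // hypothetical namespace, not the project's backend
{

// Serial stand-in for a parallel index loop (assumed signature).
template<typename F>
void apply( std::size_t n, F f )
{
    for( std::size_t i = 0; i < n; ++i )
        f( i );
}

// Free function: the lambda's enclosing function is not a class member,
// so the access rules for member functions do not come into play.
inline void scale_all( double * data, std::size_t n, double factor )
{
    apply( n, [data, factor]( std::size_t i ) { data[i] *= factor; } );
}

class Method
{
public:
    explicit Method( std::vector<double> values ) : values_( std::move( values ) ) {}

    void iterate()
    {
        rescale();
    }

    const std::vector<double> & values() const
    {
        return values_;
    }

private:
    // Private member: it creates no lambda itself, it only delegates.
    void rescale()
    {
        scale_all( values_.data(), values_.size(), 0.5 );
    }

    std::vector<double> values_;
};

inline void demo_method()
{
    Method m( std::vector<double>{ 2.0, 4.0 } );
    m.iterate();
    assert( m.values()[0] == 1.0 && m.values()[1] == 2.0 );
}

} // namespace sketch
```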


Note also

  1. __host__ __device__ extended lambdas cannot be generic lambdas.
  2. ...
  3. ...
  4. An extended lambda has the following restrictions on captured variables:
    • ...
    • A variable can only be captured by value.
    • A variable of array type cannot be captured if the number of array dimensions is greater than 7.
    • ...
    • A function parameter that is an element of a variadic argument pack cannot be captured.
    • ...
    • Init-capture is not supported for __host__ __device__ extended lambdas. Init-capture is supported for __device__ extended lambdas, except when the init-capture is of array type or of type std::initializer_list.
    • The function call operator for an extended lambda is not constexpr. The closure type for an extended lambda is not a literal type. The constexpr specifier cannot be used in the declaration of an extended lambda.

@GPMueller GPMueller linked an issue May 25, 2020 that may be closed by this pull request
@GPMueller GPMueller force-pushed the develop branch 2 times, most recently from 212372f to f49c114 Compare April 25, 2022 22:01
@muellan muellan force-pushed the develop branch 9 times, most recently from 63fbd74 to 2edff73 Compare June 7, 2023 17:01
Development

Successfully merging this pull request may close these issues.

Core: improve CUDA code with C++11 features
2 participants