Core: variadic templates for backend. #559

GPMueller · 2020-03-28T09:38:43Z

The assign and reduce parallelisation functions can now handle arbitrary numbers of function arguments, which are forwarded to the lambda.
Therefore, the lambda is now placed before the variadic arguments.
Note that they can now also be called without arguments, allowing e.g. an assignment of zero to an entire field.
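
A minimal serial sketch of the calling convention described above, not the actual backend code. The namespace `sketch` and the signature of `assign` are assumptions made for illustration; the real backend functions may differ (for instance, in how the size is passed). The lambda comes first, then any number of pointers whose i-th elements are forwarded to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch // hypothetical namespace, not the project's Backend::par
{

// Sets out[i] = f( args[i]... ) for every index i of `out`.
// The lambda precedes the variadic pack, so the pack may be empty.
// A parallel backend would distribute this loop (e.g. over OpenMP threads).
template<typename T, typename F, typename... Args>
void assign( std::vector<T> & out, F f, const Args *... args )
{
    const std::size_t N = out.size();
    for( std::size_t i = 0; i < N; ++i )
        out[i] = f( args[i]... );
}

} // namespace sketch
```

With two input arrays the lambda receives one element of each; with an empty pack it is called without arguments, which is how a whole field could be set to zero: `sketch::assign( field, [] { return 0.0; } );`.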

This PR is related to issue #529.

TODO:

- create variadic lambdas in backend and update usage
- replace Vectormath and Manifoldmath with backend functions to get rid of the cuda files


coveralls · 2020-03-28T13:27:27Z

Coverage remained the same at 79.423% when pulling 61da747 on feature-variadic-lambda into 55cc299 on develop.

We cannot capture variadic parameters in the CUDA lambdas, which we would need in the variadic reduce. The API now includes an integer size argument again, since it otherwise would not work if the parameter pack was empty (and it is needed in the CUDA version of reduce).
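
The role of the explicit size argument can be sketched in plain serial C++. The namespace `sketch` and this `reduce` signature are assumptions for illustration only, not the backend's actual API. The point is that with an empty parameter pack there is no array from which the length could be inferred, so it has to be passed in.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch // hypothetical namespace, not the project's Backend::par
{

// Returns the sum of f( args[i]... ) for i in [0, N).
// N is explicit because it cannot be deduced when `args` is empty.
template<typename F, typename... Args>
auto reduce( std::size_t N, F f, const Args *... args ) -> decltype( f( args[0]... ) )
{
    decltype( f( args[0]... ) ) result{}; // value-initialised, i.e. zero for arithmetic types
    for( std::size_t i = 0; i < N; ++i )
        result += f( args[i]... );
    return result;
}

} // namespace sketch
```

In this sketch the arrays are passed as function arguments and expanded inside the loop rather than captured by the lambda, so the lambda itself never needs to capture an element of the parameter pack.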

codecov · 2020-04-02T08:04:44Z

Codecov Report

Merging #559 into develop will increase coverage by 0.21%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #559      +/-   ##
===========================================
+ Coverage    50.06%   50.27%   +0.21%     
===========================================
  Files           88       88              
  Lines        10136    10183      +47     
===========================================
+ Hits          5075     5120      +45     
- Misses        5061     5063       +2


GPMueller · 2020-04-21T15:16:32Z

Unfortunately, the extended lambdas in CUDA are more restricted than I thought, see https://docs.nvidia.com/cuda/cuda-c-programming-guide/#extended-lambda-restrictions

The main issue is this rule:

If the enclosing function is a class member, then the following conditions must be satisfied:

- All classes enclosing the member function must have a name.
- The member function must not have private or protected access within its parent class.
- All enclosing classes must not have private or protected access within their respective parent classes.

This means that, for example, most `Method_GNEB` functions and members would need to be public; otherwise the class cannot really use `Backend::par::apply` etc.
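
One conceivable way around the access rule, shown as a plain C++ sketch, is to create the lambda inside a free function so that its enclosing function is not a class member at all. This is an assumption for illustration, not something this PR states it adopted; `scale_in_place` and `Method_Example` are hypothetical names, and in real CUDA code the lambda would additionally carry the `__device__` annotation and be passed to a parallel backend function.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Free function (hypothetical): the lambda's enclosing function is not a
// class member, so the member-access rules quoted above do not apply to it.
inline void scale_in_place( double * data, std::size_t N, double factor )
{
    // Only by-value captures are used: a pointer and a scalar.
    auto op = [data, factor]( std::size_t i ) { data[i] *= factor; };
    for( std::size_t i = 0; i < N; ++i ) // serial stand-in for a parallel apply
        op( i );
}

class Method_Example // hypothetical class, standing in for a solver method
{
public:
    explicit Method_Example( std::vector<double> v ) : values( std::move( v ) ) {}
    void iterate() { rescale(); }
    const std::vector<double> & get() const { return values; }

private:
    // Stays private: it only calls the free function and contains no lambda.
    void rescale() { scale_in_place( values.data(), values.size(), 0.5 ); }
    std::vector<double> values;
};
```

The trade-off is that state has to be handed over as raw pointers and scalars, since the free function cannot reach private members directly.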

Note also

__host__ __device__ extended lambdas cannot be generic lambdas.

...

...

An extended lambda has the following restrictions on captured variables:

...

A variable can only be captured by value.

A variable of array type cannot be captured if the number of array dimensions is greater than 7.

...

A function parameter that is an element of a variadic argument pack cannot be captured.

...

Init-capture is not supported for __host__ __device__ extended lambdas. Init-capture is supported for __device__ extended lambdas, except when the init-capture is of array type or of type std::initializer_list.

The function call operator for an extended lambda is not constexpr. The closure type for an extended lambda is not a literal type. The constexpr specifier cannot be used in the declaration of an extended lambda.

GPMueller requested a review from MSallermann March 28, 2020 09:38

GPMueller force-pushed the feature-variadic-lambda branch from 2830b1f to 0ad8c31 Compare March 28, 2020 12:53

GPMueller added 2 commits April 6, 2020 19:11

Core: moving from Vectormath to Backend::par.

fcb3efa

- replaced all usage of `fill`
- replaced all usage of `normalize_vectors`
- tidied up the Solvers by using `Backend::par::apply`. With this change, particularly RK4 and Heun should become faster in OpenMP and CUDA

Core: fixed recent changes to Heun and SIB.

8ed4aef

GPMueller force-pushed the feature-variadic-lambda branch from 8cd8065 to 3228c39 Compare April 20, 2020 21:48

Core: fixes for CUDA usage of Backend_par.

61da747

GPMueller force-pushed the feature-variadic-lambda branch from 3228c39 to 61da747 Compare April 20, 2020 22:02

GPMueller linked an issue May 25, 2020 that may be closed by this pull request

Core: improve CUDA code with C++11 features #529

Open

GPMueller force-pushed the develop branch 2 times, most recently from 212372f to f49c114 Compare April 25, 2022 22:01

GPMueller added core refactoring labels Feb 13, 2023

muellan force-pushed the develop branch 9 times, most recently from 63fbd74 to 2edff73 Compare June 7, 2023 17:01
