Currently, kernels are implemented twice: if we modify, e.g., momentumAndEnergyIAD.hpp, then we also need to modify cuda/cudaMomentumAndEnergyIAD.cu.
However, the code does the same thing for every particle.
For every computeXXX function in sph-exa, we should have a:
namespace kernel {
    inline void computeXXX(int pi, int *clist, ...);
}
function that takes the particle index as a parameter and only does the computation for that one particle. This function should only accept simple variables and raw pointers (by copy), and no references.
Basically, this function should be usable by OpenMP, OpenACC, and CUDA alike.
computeDensity(task) will handle data movement for OpenMP / OpenACC offloading / CUDA
computeDensity(particleArray) will handle omp / acc directives / CUDA kernel launch
kernel::computeDensity(int pi, int *clist) is identical for all models. Data movement and CUDA kernel launch are handled separately in computeDensity(task) and computeDensity(particleArray).
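The layering described above could be sketched for the CPU / OpenMP path roughly as follows. The Task and ParticleData structs, the parameter lists, and the trivial kernel body are hypothetical stand-ins, not the actual sph-exa types; a CUDA build would replace the two middle layers with device copies and a kernel launch while reusing kernel::computeDensity unchanged.

```cpp
#include <vector>

// Hypothetical stand-ins for the real sph-exa data structures.
struct Task
{
    std::vector<int> clist; // particle indices handled by this task
};

struct ParticleData
{
    std::vector<double> m;  // masses
    std::vector<double> ro; // densities (output)
};

namespace kernel
{
// Per-particle work, identical for all programming models (placeholder body).
inline void computeDensity(int pi, const int* clist, const double* m, double* ro)
{
    const int i = clist[pi];
    ro[i] = 2.0 * m[i];
}
} // namespace kernel

// "particleArray" layer: owns the loop and the directives (here: OpenMP).
inline void computeDensity(int n, const int* clist, const double* m, double* ro)
{
#pragma omp parallel for
    for (int pi = 0; pi < n; ++pi)
    {
        kernel::computeDensity(pi, clist, m, ro);
    }
}

// "task" layer: owns data movement. On the CPU this only extracts raw pointers.
inline void computeDensity(const Task& t, ParticleData& d)
{
    computeDensity(static_cast<int>(t.clist.size()), t.clist.data(), d.m.data(), d.ro.data());
}

// "taskList" layer: iterates over all tasks.
inline void computeDensity(const std::vector<Task>& taskList, ParticleData& d)
{
    for (const auto& t : taskList)
    {
        computeDensity(t, d);
    }
}
```

Only the innermost function is restricted to raw pointers and simple variables; the outer layers are free to use references and containers because they are never compiled as device code in this sketch.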
The easiest way to do this is probably by starting from the existing CUDA code, which is the most constrained.
The challenge is to compile the CUDA parts independently with nvcc. I am thinking of using a simple #include to import the kernel::computeXXX function. Code structure should look like:
include/sph/
    density.hpp: contains computeDensity(taskList) as well as CPU implementations of computeDensity(task) and computeDensity(particleArray)
    cuda/
        density.cu: contains CUDA implementations of computeDensity(task) and computeDensity(particleArray)
    kernel/
        density.hpp: contains kernel::computeDensity(int pi, int *clist, ...)
kernel/density.hpp is included both in sph/density.hpp and sph/cuda/density.cu.
Of course, we want the same pattern for all computeXXX functions, not just density.
The workflow is something like this:
computeDensity(taskList)
  -> calls computeDensity(task)
    -> calls inline computeDensity(particleArray)
      -> calls inline kernel::computeDensity(int pi, int *clist, ...)