Evaluate USE_ACCEL=opencl #683

Open
hfp opened this issue Jun 29, 2023 · 0 comments

@hfp (Member) commented Jun 29, 2023

Please evaluate USE_ACCEL=opencl and ideally share some feedback. Tuned parameters are available for the following GPUs: P100, V100, A100-40GB, A100-80GB, H100, and PVC. For practically all GPU vendors, OpenCL is simply part of the native or preferred GPU runtime installation; for example, installing CUDA also installs Nvidia's OpenCL runtime. The OpenCL backend in DBCSR does not bail out on kernels without tuned parameters, and it carries tuned defaults for common GPUs, i.e., tuned parameters are not strictly necessary.
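For orientation, here is a minimal sketch of how standalone DBCSR might be configured with the OpenCL backend via CMake. The option names USE_ACCEL and WITH_GPU reflect my understanding of the DBCSR build system, and the GPU value is only an example; please consult the DBCSR documentation for the authoritative options:

    # Assumes an OpenCL runtime/ICD loader and headers are installed (e.g., via CUDA on Nvidia).
    git clone https://github.com/cp2k/dbcsr.git
    cd dbcsr && mkdir build && cd build
    # USE_ACCEL selects the accelerator backend; WITH_GPU selects tuned parameters (optional,
    # since the OpenCL backend ships tuned defaults for common GPUs).
    cmake -DUSE_ACCEL=opencl -DWITH_GPU=V100 ..
    make -j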

Standalone DBCSR supports CUDA and OpenCL equally, except that the OpenCL backend does not fall back to larger GPU-supported GEMMs. For CP2K, OpenCL can likewise be used up to the extent of the DBCSR support; beyond that, CP2K can combine DBCSR with OpenCL and CUDA elsewhere (tested on Nvidia platforms). "Elsewhere" means GRID, DBM, DBT, FFT, and CUDA-enabled dependencies such as ELPA or COSMA; for the latter, SYCL or OpenMP offload support for GPUs may be available as well.

On Nvidia-based platforms (not HIP), some HPC deployments set the GPUs to "exclusive mode" (see nvidia-smi), which means that OpenCL-enabled applications cannot run with multiple ranks per GPU. This restriction can be lifted easily, but it requires the system setup to either change the compute mode or to give users the option to toggle it.
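For illustration, a sketch of checking and changing the compute mode with nvidia-smi (changing it typically requires administrative rights and is subject to site policy; the device index 0 is just an example):

    # Query the current compute mode of GPU 0
    nvidia-smi -i 0 --query-gpu=compute_mode --format=csv
    # Switch from exclusive-process back to the default (shared) mode
    sudo nvidia-smi -i 0 -c DEFAULT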

Ideally, the outcome of such an evaluation can guide future development or contributions.
