Use GPU preprocessor option to compile for GPU
shuds13 committed Jan 11, 2023
1 parent f2331fa commit bd74796
Showing 2 changed files with 15 additions and 13 deletions.
@@ -33,14 +33,14 @@ mpicc -O3 -o forces.x forces.c -lm
# Need to toggle to OpenMP target directive in forces.c.

# xl
-# xlc_r -O3 -qsmp=omp -qoffload -o forces.x forces.c
+# xlc_r -WF,-DGPU -O3 -qsmp=omp -qoffload -o forces.x forces.c

# Nvidia (nvc) compiler with mpicc and on Cray system with target (Perlmutter)
-# mpicc -O3 -fopenmp -mp=gpu -o forces.x forces.c
-# cc -O3 -fopenmp -mp=gpu -target-accel=nvidia80 -o forces.x forces.c
+# mpicc -DGPU -O3 -fopenmp -mp=gpu -o forces.x forces.c
+# cc -DGPU -O3 -fopenmp -mp=gpu -target-accel=nvidia80 -o forces.x forces.c

# Spock/Crusher (AMD ROCm compiler)
-# cc -I${ROCM_PATH}/include -L${ROCM_PATH}/lib -lamdhip64 -fopenmp -O3 -o forces.x forces.c
+# cc -DGPU -I${ROCM_PATH}/include -L${ROCM_PATH}/lib -lamdhip64 -fopenmp -O3 -o forces.x forces.c

# Intel oneAPI (Clang based) Compiler (JIT compiled for device)
-# mpiicx -O3 -fiopenmp -fopenmp-targets=spir64 -o forces.x forces.c
+# mpiicx -DGPU -O3 -fiopenmp -fopenmp-targets=spir64 -o forces.x forces.c
libensemble/tests/scaling_tests/forces/forces_app/forces.c (18 changes: 10 additions & 8 deletions)
@@ -9,7 +9,10 @@
Particle force arrays are allreduced across ranks.
Sept 2019:
-Added OpenMP options for CPU and GPU. Toggle in forces_naive function.
+Added OpenMP options for CPU and GPU.
+Jan 2022:
+Use GPU preprocessor option to compile for GPU (e.g. -DGPU).
Run executable on N procs:
@@ -138,18 +141,17 @@ double forces_naive(int n, int lower, int upper, particle* parr) {
double ret = 0.0;
double dx, dy, dz, r, force;

+#ifdef GPU
    // For GPU/Accelerators
-    /*
    #pragma omp target teams distribute parallel for \
                map(to: lower, upper, n) map(tofrom: parr[0:n]) \
-                reduction(+: ret) //thread_limit(128) //*/
-
-    // For CPU
-    //*
+                reduction(+: ret)
+#else
+    // Use OpenMP threads on CPU
    #pragma omp parallel for default(none) shared(n, lower, upper, parr) \
                private(i, j, dx, dy, dz, r, force) \
-                reduction(+:ret) //*/
-
+                reduction(+:ret)
+#endif
for(i=lower; i<upper; i++) {
for(j=0; j<n; j++){

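As a standalone illustration of the pattern above (the names pair_sum and pos and the compile lines in the comment are assumptions, not taken from forces.c), a pairwise reduction with the same #ifdef GPU selection between the OpenMP target offload pragma and host threading might look like this:

/* pairsum.c -- sketch of the #ifdef GPU / OpenMP toggle, not the actual forces.c.
   CPU build:  cc -O3 -fopenmp -o pairsum.x pairsum.c -lm
   GPU build:  cc -DGPU -O3 -fopenmp -mp=gpu -o pairsum.x pairsum.c -lm   (offload flags vary by compiler) */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Sum of 1/|x_i - x_j| over all pairs i != j of n points on a line. */
static double pair_sum(int n, const double* pos) {
    double ret = 0.0;
    int i, j;

#ifdef GPU
    /* Offload: copy positions to the device, reduce ret back to the host. */
    #pragma omp target teams distribute parallel for private(j) \
                map(to: pos[0:n]) map(tofrom: ret) reduction(+: ret)
#else
    /* Host threading. */
    #pragma omp parallel for default(none) shared(n, pos) private(i, j) reduction(+: ret)
#endif
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            if (i != j) {
                ret += 1.0 / fabs(pos[i] - pos[j]);
            }
        }
    }
    return ret;
}

int main(void) {
    const int n = 2048;
    double* pos = malloc(n * sizeof(double));
    int i;
    for (i = 0; i < n; i++) {
        pos[i] = i + 0.5;
    }
    printf("pair sum = %f\n", pair_sum(n, pos));
    free(pos);
    return 0;
}

Built without -DGPU, the host pragma is used; built with -DGPU plus the compiler's offload flags, the target construct is used instead, mirroring the compile lines in the first file of this commit.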