Error while building for OpenACC support with pgf90 21.3 #5

Closed
echoi opened this issue May 20, 2021 · 3 comments

Comments

@echoi
Contributor

echoi commented May 20, 2021

Hi, I'm reporting an error I ran into while building geoflac with OpenACC support.
I am using pgf90 from NVIDIA HPC SDK 21.3.
A working recipe for building geoflac with OpenACC would be much appreciated!

(base) echoi@ptah:~/opt/geoflac/src$ make
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c myrandom_mod.f90
myrandom:
      9, Generating acc routine seq
         Generating Tesla code
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c params.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c arrays.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c phases.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c marker_data.f90
allocate_markers:
     26, Generating update device(max_markers)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$p (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$sd (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$p (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$sd (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$p (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$sd (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$p (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$sd (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$p (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$sd (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$p (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$sd (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$p (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$sd (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$p (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$sd (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 95)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 95)
add_marker:
     41, Generating acc routine seq
         Generating Tesla code
  0 inform,   0 warnings,  22 severes, 0 fatal for add_marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 129)
newphase2marker:
    115, Loop unrolled 2 times
    121, Memory zero idiom, loop replaced by call to __c_mzero8
  0 inform,   0 warnings,   1 severes, 0 fatal for newphase2marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$p (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$sd (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$p (marker_data.f90: 150)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 150)
count_phase_ratio:
    132, Generating acc routine seq
         Generating Tesla code
    142, Memory zero idiom, loop replaced by call to __c_mzero4
    143, Loop not vectorized: data dependency
    149, Generated vector simd code for the loop
  0 inform,   0 warnings,  10 severes, 0 fatal for count_phase_ratio
make: *** [Makefile:216: marker_data.o] Error 2

(base) echoi@ptah:~/opt/geoflac/src$ pgf90 --version

pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
@tan2
Owner

tan2 commented May 20, 2021 via email

@echoi
Contributor Author

echoi commented May 20, 2021

The original error was fixed by adding the option -ta=tesla:cc60,managed.
However, another error occurred while compiling bc_update.f90:

pgf90 -g -acc=gpu -Mcuda -Minfo=accel -ta=tesla:cc60,managed -O2 -Minfo=all -c bc_update.f90
bc_update:
     18, Generating implicit copyout(force(:,:,:)) [if not already present]
     19, Loop is parallelizable
         Generating Tesla code
         19,   ! blockidx%x threadidx%x auto-collapsed
             !$acc loop gang, vector(128) collapse(3) ! blockidx%x threadidx%x
     23, Accelerator serial kernel generated
         Generating Tesla code
         25, !$acc do seq
         57, !$acc do seq
     23, Generating implicit copyin(force(:,:,1:2)) [if not already present]
         Generating implicit copyout(force(:,nx,1:2)) [if not already present]
         Generating implicit copyin(iphase(:,:)) [if not already present]
         Generating implicit copy(j) [if not already present]
         Generating implicit copyin(cord(:,:,1:2)) [if not already present]
     26, Accelerator restriction: induction variable live-out from loop: j
     28, Accelerator restriction: induction variable live-out from loop: j
     29, Accelerator restriction: induction variable live-out from loop: j
     35, Accelerator restriction: induction variable live-out from loop: j
     36, Accelerator restriction: induction variable live-out from loop: j
     40, Accelerator restriction: induction variable live-out from loop: j
     41, Accelerator restriction: induction variable live-out from loop: j
     45, Accelerator restriction: induction variable live-out from loop: j
     46, Accelerator restriction: induction variable live-out from loop: j
     49, Accelerator restriction: induction variable live-out from loop: j
     58, Accelerator restriction: induction variable live-out from loop: j
     60, Accelerator restriction: induction variable live-out from loop: j
     61, Accelerator restriction: induction variable live-out from loop: j
     68, Accelerator restriction: induction variable live-out from loop: j
     69, Accelerator restriction: induction variable live-out from loop: j
     74, Accelerator restriction: induction variable live-out from loop: j
     75, Accelerator restriction: induction variable live-out from loop: j
     78, Accelerator restriction: induction variable live-out from loop: j
     79, Accelerator restriction: induction variable live-out from loop: j
     82, Accelerator restriction: induction variable live-out from loop: j
     90, Generating implicit copyin(cord(:,:,1:2),nopbou(1:nopbmax,1:4)) [if not already present]
         Generating implicit copy(force(:,:,:)) [if not already present]
         Generating implicit copyin(bcstress(1:nopbmax,1:2)) [if not already present]
     91, Complex loop carried dependence of force prevents parallelization
         Loop carried dependence due to exposed use of force(:,:,:) prevents parallelization
         Accelerator serial kernel generated
         Generating Tesla code
         91, !$acc loop seq
nvvmCompileProgram error 9: NVVM_ERROR_COMPILATION.
Error: /tmp/pgaccA26soObjQ4KE.gpu (1041, 4): parse multiple definition of local value named 'li1167_tca0'
ptxas /tmp/pgacc626sU5dvkTRA.ptx, line 1; fatal   : Missing .version directive at start of file '/tmp/pgacc626sU5dvkTRA.ptx'
ptxas fatal   : Ptx assembly aborted due to errors
NVFORTRAN-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (bc_update.f90: 90)
NVFORTRAN/x86-64 Linux 21.3-0: compilation aborted
make: *** [Makefile:219: bc_update.o] Error 2

According to https://forums.developer.nvidia.com/t/nv-21-3-fails-to-compile-my-openacc-code/176435, NVIDIA HPC SDK 21.3 has several problems, including this one. The workaround suggested in the NVIDIA forum, adding -Mx,231,0x01, fixed the error.
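
For reference, the two flag changes reported so far in this thread can be collected in one place. The snippet below is only a sketch: `acc_flags` is a hypothetical helper and is not part of geoflac's Makefile, and the flags are exactly the ones quoted above for SDK 21.3, nothing more.

```shell
#!/bin/sh
# Sketch: assemble the pgf90/nvfortran OpenACC flags reported in this thread.
# acc_flags is a hypothetical helper, not part of geoflac's build system.
acc_flags() {
    cc="${1:-60}"   # compute-capability suffix; cc60 is what was used for the GTX 1080 here
    # -ta=tesla:ccNN,managed : cleared the "acc declare create()" errors in marker_data.f90
    # -Mx,231,0x01           : NVIDIA forum workaround for the 21.3 NVVM compilation failure
    printf '%s\n' "-acc=gpu -Mcuda -ta=tesla:cc${cc},managed -Mx,231,0x01"
}

# Example: print the flags for a cc60 target
acc_flags 60
```

The output could then be pasted into the compiler flags in the Makefile; which Makefile variable holds them is not shown in this thread, so that part is left open.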

The build now completes on my machine (see below for specs), but the executable segfaults when run with examples/subduction.inp.

$ nvidia-smi
Thu May 20 09:53:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:04:00.0  On |                  N/A |
|  0%   38C    P8     9W / 215W |     78MiB /  8111MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

$ pgf90 -V

pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

@echoi
Contributor Author

echoi commented May 21, 2021

The issue has been resolved.
I tried NVIDIA HPC SDK 21.2 and it works. For my GTX 1080 GPU, I only had to change cc70 to cc60 in the compile and link options.
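
In case it helps someone with a different GPU, the cc70 to cc60 change can be scripted. This is a minimal sketch, assuming the target appears literally as `ccNN` in the makefile; `retarget_cc` is a hypothetical helper, and the `FFLAGS` line in the demo is illustrative rather than copied from geoflac's Makefile.

```shell
#!/bin/sh
# Sketch: switch the GPU target in a makefile from one compute capability to another.
# retarget_cc is a hypothetical helper; it assumes the target is spelled "ccNN".
retarget_cc() {
    file="$1"; old="$2"; new="$3"
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed "s/cc${old}/cc${new}/g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo on a throwaway file (illustrative content, not the real Makefile).
demo=$(mktemp)
printf '%s\n' 'FFLAGS = -acc=gpu -ta=tesla:cc70,managed' > "$demo"
retarget_cc "$demo" 70 60
cat "$demo"
rm -f "$demo"
```

Because the substitution is global, it covers both the compile and the link options as long as they live in the same file.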

Although this is a separate issue, I report here that the OpenMP version built with pgf90 crashes during remeshing on my machine. When built with gfortran, however, the OpenMP version works fine.

@echoi echoi closed this as completed May 21, 2021
tan2 added a commit that referenced this issue Feb 10, 2023
* Add phase 17 for metamorphic sediment
* Rename phase 5 to schist
* Phase change sequence: ksed2 (#11 sediment) -> ksed1 (#10 sedi. rock)
  -> kmetased (#17, meta. sedi. rock) -> kschist (#5, schist)