Can't compile with CUDA and MEMORY_INSTALLED_PER_CORE_IN_GB=6.d0 #186

Closed
luet opened this issue Jul 22, 2014 · 8 comments


luet commented Jul 22, 2014

Problem description

I get a compile error when building the CUDA version of specfem3d_globe with MEMORY_INSTALLED_PER_CORE_IN_GB > 4.

When setting MEMORY_INSTALLED_PER_CORE_IN_GB=6.d0, the error I get is:

 size of static arrays per slice =    4504.6434200000003       MB
                                 =    4295.9627342224121       MiB
                                 =    4.5046434199999998       GB
                                 =    4.1952761076390743       GiB

    (should be below 80% or 90% of the memory installed per core)
    (if significantly more, the job will not run by lack of memory)
    (note that if significantly less, you waste a significant amount
     of memory per processor core)
    (but that can be perfectly acceptable if you can afford it and
     want faster results by using more cores)

 size of static arrays for all slices =    108.11144208000000       GB
                                      =    100.68662658333778       GiB
                                      =   0.10811144207999999       TB
                                      =   9.83267837727908045E-002  TiB

 *******************************************************************************
 Estimating optimal disk dumping interval for UNDO_ATTENUATION:
 *******************************************************************************

STOP you are using more memory than what you told us is installed!!! there is an error

make: *** [OUTPUT_FILES/values_from_mesher.h] Error 1
make: *** Waiting for unfinished jobs....

I need more than 4 GB per core because NEX_XI=256. If I use NEX_XI=128, I can compile and run without problems.
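
For reference, this limit is a compile-time constant rather than a Par_file setting, so changing it forces a rebuild. Assuming the constant lives in setup/constants.h (the path is an assumption), it can be located with:

    # find where the memory limit is declared (path assumed)
    grep -n "MEMORY_INSTALLED_PER_CORE_IN_GB" setup/constants.h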

I have posted two Par_files on an external web site (http://geoweb3.princeton.edu/~luet/), since I don't think we can upload files on GitHub. Those files are:

  1. Par_file_GPU_movie (NEX_XI=128)
  2. Par_file_GPU_syn (NEX_XI=256)

If I set MEMORY_INSTALLED_PER_CORE_IN_GB=10.d0, the compilation goes further but fails at link time: see link_error.txt.

Configure and compile

I configure with:

   configure FC=gfortran CC=gcc MPIFC=mpif90 MPICC=mpicc --with-cuda=cuda5

The problem occurs with both the GNU and Intel compilers.

I use CUDA version 5.5.22.

@luet luet added the bug label Jul 22, 2014
komatits commented Jul 22, 2014

Hi David,

MEMORY_INSTALLED_PER_CORE_IN_GB is unrelated to CUDA; it is used by UNDO_ATTENUATION. Just set UNDO_ATTENUATION = .false. in DATA/Par_file (and update to the latest version of "devel", in which UNDO_ATTENUATION was improved last week).
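
For example, as a one-line edit (a sketch; match the whitespace used in your Par_file):

    # switch UNDO_ATTENUATION off in the run parameters
    sed -i 's/^UNDO_ATTENUATION *=.*/UNDO_ATTENUATION                = .false./' DATA/Par_file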

Thanks,
Dimitri.

Dimitri Komatitsch
CNRS Research Director (DR CNRS), Laboratory of Mechanics and Acoustics,
UPR 7051, Marseille, France http://komatitsch.free.fr


QuLogic commented Jul 22, 2014

For the link error, since you have large static arrays, you will need to add -mcmodel=medium to both FCFLAGS and CFLAGS. Also -shared-intel if using Intel compilers. Alternatively, use more processors or a lower resolution.
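
For example, a sketch based on the configure line above (drop -shared-intel when using the GNU compilers):

    ./configure FC=ifort CC=icc MPIFC=mpif90 MPICC=mpicc \
        FCFLAGS="-mcmodel=medium -shared-intel" \
        CFLAGS="-mcmodel=medium -shared-intel" \
        --with-cuda=cuda5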


luet commented Jul 22, 2014

@komatits you are right. I was not using the latest devel version. I got past this error.


luet commented Jul 22, 2014

@QuLogic this is the fix indeed. But I think there might be a problem with the configure script.
If I do

./configure FC=ifort CC=icc MPIFC=mpif90 MPICC=mpicc CFLAGS="-mcmodel=large -shared-intel"  FCFLAGS="-mcmodel=large -shared-intel"  --with-opencl

the make step fails. The problem is that ifort is never passed the -mcmodel=large and -shared-intel options.

But if I try to trick it by doing:

./configure FC="ifort -mcmodel=large -shared-intel"  CC=icc MPIFC=mpif90 MPICC=mpicc CFLAGS="-mcmodel=large -shared-intel"  --with-opencl

it works.
Am I doing something wrong?
Thanks,
David


QuLogic commented Jul 22, 2014

You are correct; for some reason FCFLAGS is commented out. I do not know why; maybe @komatits knows?
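
One way to confirm this (a sketch, assuming configure writes a top-level Makefile) is to check whether the FCFLAGS you passed to configure actually reach the generated Makefile:

    # if FCFLAGS shows up commented out (or not at all), configure dropped it
    grep -n "FCFLAGS" Makefile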

PS: you should add -O2 (or similar) to CFLAGS; optimization flags are not specified automatically. Maybe we should change that.
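
For instance (a sketch; note that until FCFLAGS is uncommented in the configure script, the Fortran flags may need to be folded into FC as in the workaround earlier in this thread):

    ./configure FC=ifort CC=icc MPIFC=mpif90 MPICC=mpicc \
        CFLAGS="-O2 -mcmodel=large -shared-intel" \
        FCFLAGS="-O2 -mcmodel=large -shared-intel" \
        --with-cuda=cuda5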


luet commented Jul 22, 2014

Good point about -O2. I would agree that this could be set by default.

komatits commented

Not a bug apparently (?).

komatits commented

Fixed by David (@luet) by uncommenting FCFLAGS.
