Parallel build issue in 1.7.0 #30

Closed
QuLogic opened this issue Oct 23, 2014 · 5 comments

@QuLogic
Contributor

QuLogic commented Oct 23, 2014

I'm building 1.7.0 using configure (not CMake), and parallel builds (make -j8) will sporadically fail with the following error:

make[3]: Entering directory `/home/elliott/code/adios-1.7.0/tests/genarray'
rm -f gwrite_genarray.fh gread_genarray.fh
../../utils/gpp/gpp.py ./genarray3d.xml
mpif90 -DHAVE_CONFIG_H -I. -I../..  -I../../src    -g -O2 -c -o genarray2D-genarray2D.o `test -f 'genarray2D.F90' || echo './'`genarray2D.F90
rm -f gwrite_genarray.fh gread_genarray.fh
mpif90 -DHAVE_CONFIG_H -I. -I../..  -I../../src    -g -O2 -c -o copyarray2D-copyarray2D.o `test -f 'copyarray2D.F90' || echo './'`copyarray2D.F90
../../utils/gpp/gpp.py ./genarray3d.xml
test "." = "." || cp ./genarray3d.xml ./genarray.xml .
mpif90 -DHAVE_CONFIG_H -I. -I../..  -I../../src    -g -O2 -c -o genarray-genarray.o `test -f 'genarray.F90' || echo './'`genarray.F90
mpif90 -DHAVE_CONFIG_H -I. -I../..  -I../../src    -g -O2 -c -o copyarray-copyarray.o `test -f 'copyarray.F90' || echo './'`copyarray.F90
copyarray.F90:209.18:

    cache_end_time = MPI_WTIME()
                  1
Error: Symbol 'cache_end_time' at (1) has no IMPLICIT type
copyarray.F90:200.20:

    cache_start_time = MPI_WTIME()
                    1
Error: Symbol 'cache_start_time' at (1) has no IMPLICIT type
copyarray.F90:210.20:

    cache_total_time = cache_end_time - cache_start_time
                    1
Error: Symbol 'cache_total_time' at (1) has no IMPLICIT type
copyarray.F90:239.18:

    cache_end_time = MPI_WTIME()
                  1
Error: Symbol 'cache_end_time' at (1) has no IMPLICIT type
copyarray.F90:233.20:

    cache_start_time = MPI_WTIME()
                    1
Error: Symbol 'cache_start_time' at (1) has no IMPLICIT type
copyarray.F90:240.20:

    cache_total_time = cache_end_time - cache_start_time
                    1
Error: Symbol 'cache_total_time' at (1) has no IMPLICIT type
make[3]: *** [copyarray-copyarray.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/home/elliott/code/adios-1.7.0/tests/genarray'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/elliott/code/adios-1.7.0/tests'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/elliott/code/adios-1.7.0'
make: *** [all] Error 2

A subsequent build with serial make completes fine. This is usually a clear sign that some dependency is missing from the Makefile.

@pnorbert
Contributor

So far no one has been able to figure out how to define dependencies for Fortran 90 source files that contain modules so that parallel make works. We would be glad to learn about a solution.

@QuLogic
Contributor Author

QuLogic commented Oct 23, 2014

The general method is to make each object file depend on the object file of the source that defines the modules it uses. Adding a dependency on the module file itself doesn't work, because Make doesn't really know about module files, and compilers are somewhat inconsistent about when they create or update them.

We do this quite a bit in specfem3d_globe (though without automake) and it works quite fine.
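
As a rough illustration of the object-on-object approach described above, the sketch below scans Fortran sources for `module` definitions and `use` statements and produces Make rules of the form `user.o: provider.o`. It is not how specfem3d_globe or ADIOS generate their rules. The function names, the regular expressions, and the plain `<stem>.o` object naming are assumptions made for illustration (automake may prefix object names, as in `copyarray-copyarray.o` in the log above).

```python
import os
import re

# "module foo" at the start of a line, excluding "module procedure ...".
MODULE_DEF = re.compile(
    r"^[ \t]*module[ \t]+(?!procedure\b)(\w+)[ \t]*(?:!.*)?$",
    re.IGNORECASE | re.MULTILINE,
)
# "use foo", "use :: foo", or "use, intrinsic :: foo".
USE_STMT = re.compile(
    r"^[ \t]*use\b[ \t]*(?:,[ \t]*\w+[ \t]*)?(?:::[ \t]*)?(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def obj_name(source):
    """Hypothetical naming scheme: foo.F90 -> foo.o."""
    return os.path.splitext(source)[0] + ".o"


def object_deps(sources):
    """Map {file name: source text} to a list of 'user.o: provider.o' rules."""
    # Record which source file defines each module (Fortran is case-insensitive).
    provider = {}
    for name, text in sources.items():
        for mod in MODULE_DEF.findall(text):
            provider[mod.lower()] = name

    rules = []
    for name, text in sorted(sources.items()):
        deps = set()
        for mod in USE_STMT.findall(text):
            src = provider.get(mod.lower())
            # Skip intrinsic/external modules and modules defined in the same file.
            if src is not None and src != name:
                deps.add(obj_name(src))
        if deps:
            rules.append(f"{obj_name(name)}: {' '.join(sorted(deps))}")
    return rules


# Made-up two-file project, for illustration only.
example = {
    "main.F90": "program p\n  use, intrinsic :: iso_c_binding\n  use timers\nend program p\n",
    "timers.F90": "module timers\n  real :: t\nend module timers\n",
}
print("\n".join(object_deps(example)))  # prints: main.o: timers.o
```

Because the generated rules name only object files, make can order the compilations without having to track the `.mod` files at all.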

@QuLogic
Contributor Author

QuLogic commented Oct 23, 2014

Actually, I looked a little closer at the files in the tests/genarray directory. I have to wonder if the problem has anything to do with dependencies now. It looks like copyarray.F90 and genarray.F90 both contain the definition for the same module, but with different contents. This is pretty fragile since many compilers save cached versions of the module in files named after the module itself. Since it doesn't appear that these files are even compiled into the same executable, I don't think they need to have the same module names there...
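
One way to spot this kind of clash ahead of time is to list module names that are defined in more than one source file. The following is only a sketch under assumptions: the function name and regular expression are made up for illustration, and `shared_mod` is a hypothetical module name, since the thread does not say what the duplicated module is actually called.

```python
import re
from collections import defaultdict

# "module foo" at the start of a line, excluding "module procedure ...".
MODULE_DEF = re.compile(
    r"^[ \t]*module[ \t]+(?!procedure\b)(\w+)[ \t]*(?:!.*)?$",
    re.IGNORECASE | re.MULTILINE,
)


def duplicate_modules(sources):
    """Map {file name: source text} to {module name: [files defining it]}.

    Only modules defined in more than one file are reported. Names are
    lower-cased because Fortran identifiers are case-insensitive.
    """
    definers = defaultdict(set)
    for name, text in sources.items():
        for mod in MODULE_DEF.findall(text):
            definers[mod.lower()].add(name)
    return {mod: sorted(files) for mod, files in definers.items() if len(files) > 1}


# Made-up file contents; "shared_mod" is a hypothetical module name.
example = {
    "copyarray.F90": "module shared_mod\n  real*8 :: cache_total_time\nend module shared_mod\n",
    "genarray.F90": "MODULE shared_mod\n  integer :: n\nEND MODULE shared_mod\n",
    "other.F90": "module other_mod\nend module other_mod\n",
}
print(duplicate_modules(example))
```

Any name reported here would map to a single cached module file in a shared build directory, so two compilations running in parallel could overwrite each other's copy.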

@QuLogic
Contributor Author

QuLogic commented Oct 23, 2014

Looks like the name was fixed in master at ac9b543. For me, this seems to be sufficient, though you may be aware of other cases (where I just don't have a parallel-enough compile to trigger it).

However, I'll just point out that the program name is also the same in both files (this makes no difference for compiling, I think; it's just a consistency thing), and the generated module files need to be added to CLEANFILES in Makefile.am.

@QuLogic
Contributor Author

QuLogic commented Oct 23, 2014

Sigh, of course, after posting that I ran into the adios_*_mod.mod issues...
