homebrew-installed opencoarrays produces seg faults with simple coarray accesses #626

Open
rouson opened this issue Jan 14, 2019 · 2 comments

Comments

@rouson
Member

rouson commented Jan 14, 2019


Defect/Bug Report

  • OpenCoarrays Version: 2.3.1
  • Fortran Compiler: gfortran 8.2.0
  • C compiler used for building lib: gcc 8.2.0
  • Installation method: homebrew
  • Output of uname -a: Darwin localhost 18.2.0 Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 2018; root:xnu-4903.231.4~2/RELEASE_X86_64 x86_64
  • MPI library being used: OpenMPI
  • Machine architecture and number of physical cores: 4-core Intel Core i7
  • Version of CMake: 3.13.2

Observed Behavior

$ cat main.f90 
  type Array_Type
      real, allocatable :: values(:)
  end type
  type(Array_Type) array[*]

  allocate(array%values(2),source=0.)
  array%values = this_image()
  sync all
  print *, array%values
end
$ caf main.f90 
$ cafrun -n 4 ./a.out
   4.00000000       4.00000000    
   1.00000000       1.00000000    
   2.00000000       2.00000000    
   3.00000000       3.00000000    
[localhost:73816] *** An error occurred in MPI_Win_detach
[localhost:73816] *** reported by process [4040687617,1]
[localhost:73816] *** on win rdma window 5
[localhost:73816] *** MPI_ERR_OTHER: known error not in list
[localhost:73816] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[localhost:73816] ***    and potentially your MPI job)
[localhost:73814] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[localhost:73814] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Error: Command:
   `/usr/local/bin/mpiexec -n 4 ./a.out`
failed to run.
$ caf --version

OpenCoarrays Coarray Fortran Compiler Wrapper (caf version 2.3.1)
...

The error occurs intermittently (non-deterministically).
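
Because the failure is intermittent, a single clean run proves little. The following is a minimal sketch of a retry loop that reruns a command until it first fails. The function name `run_until_fail` and the run count are hypothetical, chosen for illustration; they are not part of OpenCoarrays.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenCoarrays): run a command up to
# MAX_RUNS times and stop at the first run that exits non-zero.
# Usage: run_until_fail MAX_RUNS COMMAND [ARGS...]
run_until_fail() {
  max_runs=$1
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    # "$@" is the command under test, with its arguments preserved.
    if ! "$@"; then
      echo "run $i of $max_runs failed"
      return 1
    fi
    i=$((i + 1))
  done
  echo "all $max_runs runs passed"
  return 0
}
```

With the reproducer above, it could be invoked as `run_until_fail 20 cafrun -n 4 ./a.out`. A run of passes does not rule out the bug; it only makes it less likely to be present.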

Installing with the OpenCoarrays installer instead of Homebrew eliminates the problem, presumably because the installer installs MPICH instead of OpenMPI.

@zbeekman
Collaborator

I'm seeing the following failures with OpenMPI and OpenCoarrays 2.5.0:

96% tests passed, 3 tests failed out of 78

Total Test time (real) =  44.76 sec

The following tests FAILED:
         14 - alloc_comp_get_convert_nums (Failed)
         23 - alloc_comp_send_convert_nums (Failed)
         69 - issue-515-mimic-mpi-gatherv (Failed)
Errors while running CTest

Hopefully it's the same problem we're seeing here... we'll see what happens with the "bottling" of the latest 2.5.0 release of OpenCoarrays.

@zbeekman
Collaborator

@rouson I can't reproduce this with a fresh install of OpenCoarrays from Homebrew. I'm going to close this. If you have issues that are persisting, you can re-open or we can investigate together.
