Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: mimic gatherv #515

Closed
t-bltg opened this issue Mar 12, 2018 · 1 comment
Closed

Bug: mimic gatherv #515

t-bltg opened this issue Mar 12, 2018 · 1 comment

Comments

@t-bltg
Copy link
Contributor

t-bltg commented Mar 12, 2018

Avg response time
Issue Stats

Defect/Bug Report

  • OpenCoarrays Version: 2.0

  • Fortran Compiler: gfortran 7.3.0

  • C compiler used for building lib: gcc 7.3.0

  • Installation method: manual

  • Output of uname -a: Linux 4.13.0-36-generic #40-Ubuntu SMP Fri Feb 16 20:07:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • MPI library being used: mpich 3.2.1

  • Machine architecture and number of physical cores: 12

  • Version of CMake: 3.10.2

Observed Behavior

I was playing around trying to reproduce the behavior of a MPI_Gatherv, and the following example was failing in strange ways (see the comments), as if the data was shifted.

! caf -o gather gather.f90
! cafrun -np 2 ./gather
! this example mimics a mpi_gatherv with root=1, and variable size chunks
! gather into specified locations from all processes in a group

program gather
   implicit none

   type gvec ! a global vector
      real, allocatable :: a(:)
   end type

   type(gvec) :: gc[*]
   real, allocatable :: gv(:), tmp(:)
   integer :: me, nimg, gsize, i, lo, hi
   logical :: fail = .false., workaround = .false.

   me = this_image()
   nimg = num_images()

   allocate(gc % a(2 * me)) ! variable size data in container
   gc % a = [(me * i, i=1, 2 * me)] ! assignement
   ! img 1: [1, 2]
   ! img 2: [2, 4, 6, 8]

   print *, me, 'gc % a=', gc % a, ' size=', size(gc % a)

   ! collect the global vector size by summing local sizes
   gsize = size(gc % a)
   call co_sum(gsize, result_image=1)
   sync all

   if (me == 1) then
      if (gsize /= 6) error stop 1
      allocate(gv(gsize)) ! allocate a global vector of size 6 on img 1
      lo = 1
      do i = 1, nimg
         !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         ! WORKAROUND, uncomment to pass this test
         if (workaround) then
            if (allocated(tmp)) deallocate(tmp)
            allocate(tmp(size(gc[i] % a))) ! ISSUE: remote access is done twice
         end if
         !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         
         ! note: automatic reallocation of tmp
         ! bounds are correct, but data seems to be shifted by 1 (as if tmp was 0 based indexing)
         tmp = gc[i] % a
         ! print *, size(tmp) ! <== uncomment to pass the test !! WHAT ?
         ! print *, tmp, lbound(tmp), ubound(tmp), size(tmp)
         ! img 1: [1, 2] lb=1 ub=2
         ! img 2: [2, 4, 6, 8] lb=1 ub=4
         ! print *, tmp(lbound(tmp)), tmp(ubound(tmp)) ! <== segfault
         ! print *, tmp(1:size(tmp)) ! correct data, left shifted ???
         ! img1: [1, 2]
         ! img1: [2, 4, 6, 8]

         hi = lo + size(tmp) - 1
         if (workaround) then
            gv(lo:hi) = tmp
         else
            ! gv(lo:hi) = tmp(0:size(tmp)-1) ! seems OC is 'left shifting' the data, hence the 0 lb
            gv(lo:hi) = tmp
         end if
         lo = hi + 1 ! start of next chunk
      end do
      print *, 'gv=', gv, ' sum=', sum(gv)

      if (abs(sum(gv) - 23.) > epsilon(0.)) fail = .true.
   end if

   sync all

   ! CMake test output handler
   call co_broadcast(fail, source_image=1)
   if (fail) then
      write(*, *) 'Test failed!'
      error stop 5
   else
      write(*, *) 'Test passed.'
   end if

end program

Expected Behavior

gv= 1.0 2.0 2.0 4.0 6.0 8.0 sum= 23.0

Steps to Reproduce

caf -o gather gather.f90
cafrun -np 2 ./gather

gv= 2.0 0.0 4.0 6.0 8.0 0.0 sum= 20.0

Question

So I spent a few hours on the reallocation step implied by the tmp = gc[i] % a line, malloc calls are correct, bounds/stride are also correct, but the offset is never initialized when the realloc_required flag is true (which is the case here: tmp is never allocated explicitly).

So a possible patch could be something like
mpi_caf.c

// around line 4650
if (extent_mismatch)
  {
    dst->dim[dst_cur_dim].lower_bound = 1;
    dst->dim[dst_cur_dim]._ubound = delta;
    dst->dim[dst_cur_dim]._stride = size;
    if (realloc_required)
      dst->offset = -1;  // <== initialize offset to 0xffffffffffffffff
  }

With this patch the test passes as expected. Is this patch correct ?

@t-bltg t-bltg mentioned this issue Mar 12, 2018
4 tasks
@zbeekman
Copy link
Collaborator

@neok-m4700 That patch looks correct to me. If I'm mistaken, perhaps @vehre or @afanfa could weigh in. I've merged the PR with the patch so if/when either Andre or Alessandro confirm this is correct, let's close the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants