ch4/ucx: MPI_Win_allocate causing NWChem problems again #6110

Closed

jeffhammond opened this issue Aug 11, 2022 · 9 comments
@jeffhammond
Member

See nwchemgit/nwchem#633 (comment) for the initial report. This is very similar to what I saw that led to bad898f.

I don't know when I'll have time to bisect this. Can somebody look at RMA and see if there were any nontrivial changes to shared-memory atomics recently?

@jeffhammond
Member Author

It was working as of v3.4.1 but is broken in 4.x branches currently tested by Debian: nwchemgit/nwchem#633 (comment).

hzhou changed the title from "MPI_Win_allocate causing NWChem problems again" to "ch4/ucx: MPI_Win_allocate causing NWChem problems again" on Aug 12, 2022
@hzhou
Contributor

hzhou commented Aug 12, 2022

nwchemgit/nwchem#633 (comment) mentions this is a regression between v4.0.1 and v4.0.2:

$ gitlog v4.0.1..v4.0.2
64a07944d5 03/29 17:13 Merge pull request #5916 from raffenet/4.0.2-version
43055d9d0c 03/29 09:56 Update version info for 4.0.2
7762795d55 03/29 09:41 Merge pull request #5868 from raffenet/4.0.x-fixes
99914087a5 03/25 12:53 modules: update yaksa
e85aae41c3 03/28 14:27 mtest: Add GPU flags to libmtest_cxx.la library
f172287e21 03/10 18:33 mpl: fix level zero properties initialization
dbcd7dda57 03/18 15:06 init: add explicit MPIR_pmi_barrier in init
338c3be330 03/07 11:56 romio_synchronized_sync: move barrier after fsync
b1ada30eab 03/15 22:24 ch4/posix: workaround for inter-process mutex on FreeBSD
a880832132 03/01 16:22 f08: add elemental to eq/neq operator functions
a9c1d2e693 02/08 17:52 test: permanently xfail collective length checks
f70e811780 02/16 14:58 mtest: Encapsulate GPU dependencies
ec8d02ce92 11/08 15:45 mtest: Convert utilities to libtool convenience libraries
dd93a149e8 02/15 13:19 ch3: Use MPIR_pmi_spawn_multiple
265273c9c3 02/16 10:31 mpir: Add pmi2 support in MPIR_pmi_spawn_multiple
9980487ef0 02/15 17:58 ch3: Define max jobid length in a single place
bc87c98e8e 02/15 17:57 ch3/sock: Remove unused header include
27605f9cbf 02/15 14:04 pm/hydra: Fix appnum retrieval in pmi2 server
d06952dc89 02/15 13:18 pmi2/simple: Fix PMI2_Nameserv_lookup
f67095200a 02/22 16:19 ch4: Workaround Intel compiler TLS bug on macOS
010c98527c 01/19 17:36 op: replace predefined datatype on input
a1f6564fd1 01/23 01:49 mpl: Use CPU affinity functions with standard names

The likely offending commit is "b1ada30eab 03/15 22:24 ch4/posix: workaround for inter-process mutex on FreeBSD" from #5894.

EDIT: Hmm, that commit only affects FreeBSD and shouldn't affect the ucx build. I am at a loss.

@hzhou
Contributor

hzhou commented Aug 12, 2022

@jeffhammond I suspect this may be an issue with armci-mpi. Does it assume MPI_Win_allocate allocates shared memory intra-node? ch4:ucx does not have shm enabled.

@jeffhammond
Member Author

It's not ARMCI-MPI. The code has not changed significantly in the past 8 years and has been tested literally thousands of times with NWChem against every RMA implementation out there, usually in tortuous single-node circumstances. ARMCI-MPI works fine with many other versions of MPICH. The bug is in MPICH ch4:ucx or in UCX itself.

@jeffhammond
Member Author

jeffhammond commented Aug 13, 2022

ARMCI-MPI does not assume MPI_Win_allocate allocates shared memory, because there is no standard-compliant way to do that. I proposed a solution to that for the MPI standard.
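
For illustration (not ARMCI-MPI code): the standard-level solution referred to here is, I believe, the MPI 4.0 change that allows MPI_Win_shared_query to be called on a window of any flavor; for a rank whose memory is not load/store accessible it returns size 0 and a NULL base pointer. A minimal sketch, assuming an MPI 4.0 library:

#include <mpi.h>
#include <stdio.h>

int main(void)
{
    MPI_Init(NULL, NULL);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Allocate a regular (non-shared-flavor) RMA window. */
    void *base = NULL;
    MPI_Win win;
    MPI_Win_allocate((MPI_Aint) 1024, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &base, &win);

    /* MPI 4.0: query whether rank 0's window memory is directly addressable.
     * A NULL base pointer / zero size means it is not backed by shared memory. */
    MPI_Aint size = 0;
    int disp_unit = 0;
    void *peer_base = NULL;
    MPI_Win_shared_query(win, 0, &size, &disp_unit, &peer_base);

    printf("rank %d: rank 0's window memory %s load/store accessible\n",
           rank, (peer_base != NULL && size > 0) ? "is" : "is not");

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}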

@jeffhammond
Member Author

This, and the corresponding difference in how the memory is freed, are the only differences in ARMCI-MPI when MPI_Win_allocate is used. None of the communication code depends on this in any way. (A sketch of the free path follows the excerpt below.)

  if (ARMCII_GLOBAL_STATE.use_win_allocate == 0) {

      if (local_size == 0) {
        alloc_slices[alloc_me].base = NULL;
      } else {
        MPI_Alloc_mem(local_size, alloc_shm_info, &(alloc_slices[alloc_me].base));
        ARMCII_Assert(alloc_slices[alloc_me].base != NULL);
      }
      MPI_Win_create(alloc_slices[alloc_me].base, (MPI_Aint) local_size, 1, MPI_INFO_NULL, group->comm, &mreg->window);
  }
  else if (ARMCII_GLOBAL_STATE.use_win_allocate == 1) {

      /* give hint to CASPER to avoid extra work for lock permission */
      if (alloc_shm_info == MPI_INFO_NULL)
          MPI_Info_create(&alloc_shm_info);
      MPI_Info_set(alloc_shm_info, "epochs_used", "lockall");

      MPI_Win_allocate( (MPI_Aint) local_size, 1, alloc_shm_info, group->comm, &(alloc_slices[alloc_me].base), &mreg->window);

      if (local_size == 0) {
        /* TODO: Is this necessary?  Is it a good idea anymore? */
        alloc_slices[alloc_me].base = NULL;
      } else {
        ARMCII_Assert(alloc_slices[alloc_me].base != NULL);
      }
  }
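
For completeness, the corresponding difference at free time presumably mirrors the above (a hypothetical sketch reusing the names from the excerpt, not the actual ARMCI-MPI code): memory obtained with MPI_Alloc_mem must be released with MPI_Free_mem after MPI_Win_free, whereas memory obtained from MPI_Win_allocate is released by MPI_Win_free itself.

  /* Hypothetical sketch of the matching free path. */
  if (ARMCII_GLOBAL_STATE.use_win_allocate == 0) {
      MPI_Win_free(&mreg->window);
      if (alloc_slices[alloc_me].base != NULL)
          MPI_Free_mem(alloc_slices[alloc_me].base);   /* pairs with MPI_Alloc_mem */
  }
  else {
      /* MPI_Win_free also releases the memory allocated by MPI_Win_allocate */
      MPI_Win_free(&mreg->window);
  }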

@raffenet
Contributor

raffenet commented Sep 8, 2022

Just to confirm @jeffhammond, I believe #6140 fixed this issue?

@jeffhammond
Member Author

That's what I've been told by Edo, who is authoritative on all NWChem issues.

@raffenet
Contributor

raffenet commented Sep 8, 2022

> That's what I've been told by Edo, who is authoritative on all NWChem issues.

👍

raffenet closed this as completed Sep 8, 2022