Workaround to bypass issue observed at very large scale with Fujitsu MPI #2874

Merged
17 changes: 17 additions & 0 deletions Src/AmrCore/AMReX_TagBox.cpp
@@ -649,7 +649,24 @@ TagBoxArray::collate (Gpu::PinnedVector<IntVect>& TheGlobalCollateSpace) const
//
const IntVect* psend = (count > 0) ? TheLocalCollateSpace.data() : nullptr;
IntVect* precv = TheGlobalCollateSpace.data();

//Issues have been observed with the following call at very large scale when using
//Fujitsu MPI. The issue seems to be related to the use of MPI_Datatype. We can
//bypass the issue by exchanging simpler integer arrays.
#ifndef __FUJITSU
Review comment (Member): We found today that the new Fujitsu compiler in -Nclang mode only defines __CLANG_FUJITSU.

ParallelDescriptor::Gatherv(psend, count, precv, countvec, offset, IOProcNumber);
#else
const int* psend_int = psend->begin();
int* precv_int = precv->begin();
Long count_int = count * AMREX_SPACEDIM;
auto countvec_int = std::vector<int>(countvec.size());
auto offset_int = std::vector<int>(offset.size());
const auto mul_funct = [](const auto el){return el*AMREX_SPACEDIM;};
std::transform(countvec.begin(), countvec.end(), countvec_int.begin(), mul_funct);
std::transform(offset.begin(), offset.end(), offset_int.begin(), mul_funct);
ParallelDescriptor::Gatherv(
psend_int, count_int, precv_int, countvec_int, offset_int, IOProcNumber);
#endif

#else
TheGlobalCollateSpace = std::move(TheLocalCollateSpace);
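
To illustrate the bookkeeping behind the workaround, here is a minimal serial sketch, not the AMReX implementation. It assumes an `IntVect`-like type made of `kSpaceDim` plain ints; `Cell`, `kSpaceDim`, `flatten`, `scale_to_ints`, and `gather_ints` are hypothetical names, and `gather_ints` is only a stand-in for an integer gatherv. The point is that when each element is sent as `kSpaceDim` ints instead of one derived-datatype element, both the per-rank counts and the per-rank offsets have to be multiplied by `kSpaceDim`. Unlike the diff above, which reinterprets the existing buffers in place, the sketch copies into flat vectors to keep the example simple.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: an IntVect-like cell holding kSpaceDim ints.
constexpr int kSpaceDim = 3;
using Cell = std::array<int, kSpaceDim>;

// Flatten a list of cells into a plain int buffer (kSpaceDim ints per cell).
std::vector<int> flatten(const std::vector<Cell>& cells)
{
    std::vector<int> out;
    out.reserve(cells.size() * kSpaceDim);
    for (const Cell& c : cells) {
        out.insert(out.end(), c.begin(), c.end());
    }
    return out;
}

// Convert per-rank counts (or offsets) from units of cells to units of ints.
std::vector<int> scale_to_ints(const std::vector<int>& in_cells)
{
    std::vector<int> in_ints(in_cells.size());
    std::transform(in_cells.begin(), in_cells.end(), in_ints.begin(),
                   [](int n) { return n * kSpaceDim; });
    return in_ints;
}

// Serial stand-in for an integer gatherv: copy each rank's int buffer
// into the receive buffer at that rank's int offset.
std::vector<int> gather_ints(const std::vector<std::vector<int>>& send_per_rank,
                             const std::vector<int>& count_ints,
                             const std::vector<int>& offset_ints)
{
    assert(send_per_rank.size() == count_ints.size());
    assert(send_per_rank.size() == offset_ints.size());
    int total = 0;
    for (int c : count_ints) {
        total += c;
    }
    std::vector<int> recv(total);
    for (std::size_t r = 0; r < send_per_rank.size(); ++r) {
        std::copy_n(send_per_rank[r].begin(), count_ints[r],
                    recv.begin() + offset_ints[r]);
    }
    return recv;
}
```

With two ranks holding one and two cells, cell counts `{1, 2}` and cell offsets `{0, 1}` become int counts `{3, 6}` and int offsets `{0, 3}`. Note that the scaled values must still fit in an `int`, which is worth keeping in mind at very large scale.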
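
Following the review comment about `-Nclang` mode, one possible way to make the guard cover both compiler modes is sketched below. This is an assumption about how the check could be written, not the code that was merged: it relies on the reviewer's statement that `-Nclang` mode defines `__CLANG_FUJITSU`, and the names `USE_INT_GATHERV_WORKAROUND` and `use_int_gatherv_workaround` are hypothetical.

```cpp
// Assumption (from the review comment): the Fujitsu compiler defines
// __CLANG_FUJITSU in -Nclang mode, and __FUJITSU otherwise.
// USE_INT_GATHERV_WORKAROUND is a hypothetical helper macro.
#if defined(__FUJITSU) || defined(__CLANG_FUJITSU)
#define USE_INT_GATHERV_WORKAROUND 1
#else
#define USE_INT_GATHERV_WORKAROUND 0
#endif

// Compile-time query usable in ordinary C++ code.
constexpr bool use_int_gatherv_workaround()
{
    return USE_INT_GATHERV_WORKAROUND != 0;
}
```

The call site could then select the plain-integer `Gatherv` path with `#if USE_INT_GATHERV_WORKAROUND` instead of testing `__FUJITSU` alone.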