
(invalid) define env_isDistributed() #764

Open
TysonRayJones wants to merge 1 commit into cleanup-custom-mpi from failed-solution-of-mpi-with-non-distrib-quest

Conversation

@TysonRayJones
Member

This was an attempt to avoid consulting comm_isInit() internally, which indicates only that MPI is active, not whether QuEST is allowed to use it! (QuEST itself may be launched non-distributed within a user-owned MPI environment.) The aim was to relax the current restriction highlighted here.

This was already an ugly attempted solution because of the nuisance of not being able to expose the internal function env_isDistributed() within the user-facing environment.h header...

It is, in fact, a totally inadequate / non-functioning solution! This is due entirely to a single function: gpu_areAnyNodesBoundToSameGpu(). This is the only function in all of QuEST which needs to perform an MPI communication before the validation within initQuESTEnv has been completed (indeed, it's invoked by validation), in scenarios where the validation is not in the process of failing. This precludes us from calling env_isDistributed() within it, since the QuESTEnv is still nullptr, and the "in-process" initQuESTEnv call has nowhere to indicate that it intends to distribute QuEST. More info in the gpu_config.cpp diff.
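The ordering problem described above can be illustrated with a small standalone sketch. Every name below (`SketchEnv`, `sketchEnvPtr`, `sketch_envIsDistributed`, `sketch_validateGpuBinding`, `sketch_initEnv`) is a hypothetical stand-in and not QuEST's actual code; the sketch only assumes that the environment is stored behind a pointer that stays null until initialisation finishes, and that a validation-time check runs before that point.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; none of these are QuEST's real definitions.
struct SketchEnv { bool isDistributed; };

static std::unique_ptr<SketchEnv> sketchEnvPtr;       // null until init completes
static bool sawDistributedDuringValidation = false;   // what validation observed

// Mimics an internal query that is answered from the environment singleton.
bool sketch_envIsDistributed() {
    return sketchEnvPtr && sketchEnvPtr->isDistributed;
}

// Mimics a validation-time check that would need MPI communication
// (the role gpu_areAnyNodesBoundToSameGpu() plays in the description above).
void sketch_validateGpuBinding() {
    sawDistributedDuringValidation = sketch_envIsDistributed();
}

void sketch_initEnv(bool useDistribution) {
    assert(!sketchEnvPtr);          // the sketch allows a single initialisation
    sketch_validateGpuBinding();    // runs BEFORE the environment exists
    sketchEnvPtr = std::make_unique<SketchEnv>(SketchEnv{useDistribution});
}
```

Calling `sketch_initEnv(true)` leaves `sawDistributedDuringValidation` false even though distribution was requested, because the query ran while the pointer was still null. That is the failure mode in miniature: the validation step cannot learn the caller's intent from an environment that has not been created yet.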

Consistent with my typical luck, this function is defined in gpu_config.cpp, which was shown at the very bottom of my IDE's search results for files containing comm_isInit(). Ergo, it was the very last function to be updated in this refactor, maximally prolonging my ordeal.

I've made this PR anyway for reference while re-attempting a solution. Thankfully, a cleaner and more robust design exists (consulting mpiCom == NULL from within comm_config.cpp).
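The description names only the check itself, so the following is a guess at the shape such a design might take, not the eventual implementation. It assumes the communication layer owns a handle that stays null unless QuEST itself was asked to distribute, and that this handle is set before any validation runs. `SketchComm`, `sketchMpiCom`, `sketch_commInit`, `sketch_commIsDistributed` and `sketch_commEnd` are hypothetical names, and a plain struct stands in for an MPI communicator so the sketch builds without MPI.

```cpp
#include <cassert>

// Stand-in for an MPI communicator handle so the sketch builds without MPI.
struct SketchComm { int numNodes; };

// Stays null unless QuEST itself is distributed, regardless of whether
// the surrounding (user-owned) MPI environment is active.
static SketchComm* sketchMpiCom = nullptr;

// Called at the very start of initialisation, before any validation,
// so that later checks can consult the handle directly.
void sketch_commInit(bool useDistribution, int numNodes) {
    assert(numNodes >= 1);
    static SketchComm storage;
    if (useDistribution) {
        storage.numNodes = numNodes;
        sketchMpiCom = &storage;
    }
}

// Answers "may QuEST communicate?" without needing the environment object.
bool sketch_commIsDistributed() {
    return sketchMpiCom != nullptr;
}

void sketch_commEnd() {
    sketchMpiCom = nullptr;
}
```

Because the answer lives in the communication layer and not in the environment object, a validation-time function could query it even while the environment pointer is still null.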

This was an attempt to avoid consulting comm_isInit() internally, which only indicates MPI is active, and not whether QuEST is allowed to use it! (QuEST itself may be launched non-distributed).

It was already an ugly attempt because of the nuisance of not exposing env_isDistributed in a header...

But it is in fact a totally inadequate solution due to a single function: gpu_areAnyNodesBoundToSameGpu()! This is the only function in all of QuEST which needs to perform an MPI communication before the validation within initQuESTEnv has been completed (indeed, it's invoked by validation), in scenarios where the validation is not currently failing. Grr!