Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to add more srun command parameters in ecbuild_add_test #61

Closed
TingLei-NOAA opened this issue Apr 18, 2024 · 7 comments
Closed

how to add more srun command parameters in ecbuild_add_test #61

TingLei-NOAA opened this issue Apr 18, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@TingLei-NOAA
Copy link

TingLei-NOAA commented Apr 18, 2024

Is your feature request related to a problem? Please describe.

A user ctest is to be run with a fixed mpi task numbers (say, -n 36). however in another machine, this job has to be run on two nodes to avoid memory related failure.

Describe the solution you'd like

The easiest way for me is to add srun comand parameters as " -N 2 --ntasks-per-node=18" in ecbuild_add_test. But digging in ecbuild_add_test.cmake and doing "try and error" , I failed to find a way to do so with ecbuild_add_test

Describe alternatives you've considered

So, I am wondering if ecbuild experts could clarify on this. Did I miss any existing ecbuild functions to do this?
Thanks.

Additional context

No response

Organisation

LYNKER--EMC/NCEP/NOAA

@TingLei-NOAA TingLei-NOAA added the enhancement New feature or request label Apr 18, 2024
@wdeconinck
Copy link
Member

Probably the better approach is to run all the tests on a single allocation rather than have each test go on the queue.
With salloc (or sbatch) you could then make sure you have enough nodes and limit the number of tasks per node.

salloc -q <queue> -N 2 --ntasks-per-node=18 ctest

In case ctest now calls e.g. srun -n 36 hostname I can see that the output is 18 tasks each on their own node.

@TingLei-NOAA
Copy link
Author

TingLei-NOAA commented Apr 22, 2024

@wdeconinck Yes. Your suggested method does work! That could be used as a work-around, at least, for being now.
The problem is that this ctest could be run among other ctests which don't need those extra nodes. So, a test-specific control is desired. If to add processing for such extra options in the ecbuild is not feasible or not regarded as necessary, I can replace the current executable with a wrapping script in which various manipulation could be done.

@wdeconinck
Copy link
Member

ecbuild_add_test goes through this logic:
https://github.com/ecmwf/ecbuild/blob/develop/cmake/ecbuild_add_test.cmake#L419C1-L432C85
MPI_ARGS could be user-defined, and was intended for e.g. openmpi's "--oversubscribe" flag.

It would be up to you to set the to -N 2 --ntasks-per-node just for that test, and only in the case of slurm and make sure it is set to original value after ecbuild_add_test.

Note I did not test following, but conceptually:

if (MPI_ARGS)
  set(RESTORE_MPI_ARGS ${MPI_ARGS})
endif()
if ( <slurm detected> )
    set( MPI_ARGS "${MPI_ARGS} -N 2 --ntasks-per-node=18")
endif()
ecbuild_add_test( ... MPI 36 ... )
if( RESTORE_MPI_ARGS )
  set(MPI_ARGS ${RESTORE_MPI_ARGS})
endif()

@TingLei-NOAA
Copy link
Author

@wdeconinck , Did you mean I need to do changes (you suggested) to ecbuild_add_test to use MPI_ARGS ( and, yes, just to use MPA_ARGS as one argument didn't work in my tries )?

@wdeconinck
Copy link
Member

@TingLei-NOAA no I meant it like in the code snippet above, to set MPI_ARGS as a normal CMake variable. It gets picked up by ecbuild_add_test.
Its original intention is as a global variable via cmake configuration "-DMPI_ARGS=..." but then you're in the same undesired situation where it gets applied to each MPI test. That's why I suggested to do the "RESTORE_MPI_ARGS ... " dance.

@TingLei-NOAA
Copy link
Author

@wdeconinck Sure. Changing of MPI_ARG before and after the ecbuild_add_test seems a solution. I will try it and come back with an update here.
So many thanks!

@TingLei-NOAA
Copy link
Author

The changes suggested by @wdeconinck does work. Thank you @wdeconinck for your help!
I would close this issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants