Python JobspecV1: Can I use flux batch rather than flux run? #5220

jan-janssen · 2023-05-30T16:10:17Z

For my density functional theory calculations, I typically have a Python process running next to an MPI-parallel Fortran code. The Python process watches the output file and interrupts the execution when the Fortran code is not converging. Ideally all the convergence detection would be handled inside Fortran directly, but the Python wrapper offers rapid prototyping and material-system-specific tuning. When I use Flux on the command line, I do something like this:

flux batch --flags=waitable -n 4 batch.sh

And then in the batch.sh script I would have:

python custodian.py &
flux run -n 4 dft_mpi_gpu

I can translate the flux run call to a JobspecV1 call in Python, but I would prefer to move the flux batch call to the Python level as well. In particular, I like the concurrent.futures-style interface for submitting JobspecV1 objects, which simplifies the integration into other Python projects.
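For readers unfamiliar with the watcher pattern described above, here is a minimal sketch using only the standard library. It is not the actual custodian.py: the output format (lines of the form `residual <value>`), the divergence rule, and the names `is_diverging` and `watch` are all assumptions made for illustration.

```python
import time


def is_diverging(lines, window=3):
    """Hypothetical rule: the last `window` residual steps all increased."""
    residuals = []
    for line in lines:
        parts = line.split()
        # Assumed output format: "residual <float>"
        if len(parts) == 2 and parts[0] == "residual":
            residuals.append(float(parts[1]))
    tail = residuals[-(window + 1):]
    if len(tail) < window + 1:
        return False  # not enough data points yet
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


def watch(proc, output_path, poll_interval=1.0):
    """Poll the output file while `proc` (a subprocess.Popen) is running.

    Terminates the process and returns "interrupted" if the run looks
    divergent, otherwise returns "finished" once the process exits.
    """
    while proc.poll() is None:
        try:
            with open(output_path) as fh:
                lines = fh.readlines()
        except FileNotFoundError:
            lines = []  # the solver has not written anything yet
        if is_diverging(lines):
            proc.terminate()
            proc.wait()
            return "interrupted"
        time.sleep(poll_interval)
    return "finished"
```

In the batch script above the watcher runs in the background next to flux run rather than owning the solver process, so a real implementation would need a different way to stop the job; the `proc` handle here only keeps the sketch self-contained.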


grondo · 2023-05-30T18:41:32Z

Are you asking if there is a JobspecV1 constructor that creates the equivalent of the flux batch command? If so, I believe what you are looking for is from_batch_command().

Let me know if that is not what you need.
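As a rough sketch of how the flux batch invocation from the original post might look at the Python level, the snippet below builds the batch script as a string and hands it to `JobspecV1.from_batch_command()`, then submits it through `flux.job.FluxExecutor`. The job name `dft-batch` and the helper names `build_batch_script` and `submit_batch` are hypothetical, and the Flux import is done lazily so the script-building helper can be used on a machine without Flux. Treat it as an illustration under those assumptions rather than a reference implementation.

```python
import os


def build_batch_script(num_tasks=4):
    """Return the batch script contents, starting with an interpreter line."""
    return "\n".join(
        [
            "#!/bin/sh",
            "python custodian.py &",
            f"flux run -n {num_tasks} dft_mpi_gpu",
            "",
        ]
    )


def submit_batch(num_slots=4):
    """Submit the script as a batch job and wait for its return code."""
    # Imported here so build_batch_script() works without Flux installed.
    from flux.job import FluxExecutor, JobspecV1

    jobspec = JobspecV1.from_batch_command(
        script=build_batch_script(num_slots),
        jobname="dft-batch",  # hypothetical job name
        num_slots=num_slots,
    )
    # Unlike the flux batch command line tool, the constructor does not
    # copy the caller's working directory and environment, so set them.
    jobspec.cwd = os.getcwd()
    jobspec.environment = dict(os.environ)

    with FluxExecutor() as executor:
        future = executor.submit(jobspec)
        return future.result()  # blocks until the batch job completes
```

Calling `submit_batch()` requires a running Flux instance; `build_batch_script()` can be inspected on its own to check what the job will execute.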

jan-janssen · 2023-05-31T11:28:28Z

Thanks a lot, from_batch_command() was exactly what I was looking for. I am just a little surprised that the parameter names change. How does num_slots compare to num_tasks? As I understand it they are identical, but maybe I am missing something.

grondo · 2023-05-31T13:33:59Z

> How does num_slots compare to num_tasks? In my understanding they are identical but maybe I am missing something.

As far as resource allocation goes, they are the same. The terminology is different for from_batch_command (and flux batch) because the resulting job does not run any user tasks itself. It allocates resources, then starts one broker per node in the resulting allocation, with the batch script (aka the initial program) executing on broker rank 0.

They are called task slots because you're requesting to allocate slots of a given resource size (e.g. cores and gpus) for eventual placement of tasks.
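One way to check the "same resources, different terminology" point is to build both kinds of jobspec with matching counts and compare their resource sections. The sketch below assumes the keyword names `num_tasks`/`cores_per_task` for `from_command()` and `num_slots`/`cores_per_slot` for `from_batch_command()`, and that the jobspec dictionary is reachable as the `jobspec` attribute; the helper name `resource_sections` and the job name are hypothetical. No expected output is claimed here.

```python
def resource_sections(count=4, cores=1):
    """Return the resource sections of a run-style and a batch-style jobspec."""
    # Imported lazily so this module can be loaded without Flux installed.
    from flux.job import JobspecV1

    # Equivalent of: flux run -n <count> dft_mpi_gpu
    run_like = JobspecV1.from_command(
        command=["dft_mpi_gpu"],
        num_tasks=count,
        cores_per_task=cores,
    )
    # Equivalent of: flux batch -n <count> batch.sh
    batch_like = JobspecV1.from_batch_command(
        script="#!/bin/sh\nflux run -n 4 dft_mpi_gpu\n",
        jobname="dft-batch",  # hypothetical job name
        num_slots=count,
        cores_per_slot=cores,
    )
    return run_like.jobspec["resources"], batch_like.jobspec["resources"]
```

With Flux available, `run_res, batch_res = resource_sections()` followed by `print(run_res == batch_res)` shows whether the two requests describe the same allocation.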

jan-janssen · 2023-05-31T13:37:58Z

Ok, thanks again for the explanation.

jan-janssen · 2023-06-01T21:00:31Z

@grondo Maybe I am doing something wrong, but when I submit more than one flux run command inside the script that I pass to from_batch_command(), it seems to hang until it reaches the last flux run command.

jan-janssen · 2023-06-01T21:07:20Z

My mistake - I forgot to close the with-statement.

jan-janssen closed this as completed May 31, 2023
