It would be useful to have a batching parameter for grouping fast-running processes within the same HPC scheduler job. My idea would be to run a bunch of instances of the same process sequentially within a single scheduler job, so as to avoid overwhelming the scheduler with many jobs.
I sometimes find myself with what logically should be a single process that executes too quickly to merit allocating a job on the cluster, and if I do allocate one per task I end up with more than one million jobs, hitting the limit on concurrent jobs that the cluster admin allows. I could batch within the process execution in some custom way, but this requires ad-hoc code refactoring and is not very elegant (i.e. a for loop running what should be a single process over a batch of inputs).
I don't know how easy this would be to implement, but I was wondering whether anyone has a similar need or an idea of whether this is feasible.
This is a sensible request. It sounds very similar to the task batching being worked on in PR #3909. That work has not been finalized or merged yet, but would the features described there solve your problem?
This will be addressed by #3909. Until then, you can use this pattern, which is essentially the custom solution you described. I think we will try to merge the task batching PR sometime this year.
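For reference, the workaround pattern can be sketched in Nextflow using the `collate` channel operator, which groups emitted items into fixed-size lists so each scheduler job iterates over a batch instead of a single input. The process name, `my_tool` command, and `params` names below are placeholders, not part of any existing pipeline:

```nextflow
// Hypothetical sketch: batch many small inputs into one scheduler job each.
// Assumes params.inputs is a glob of input files and params.batch_size is tunable.

process PROCESS_BATCH {
    input:
    path(samples)        // each task receives a batch of files, not one file

    output:
    path("*.out")

    script:
    """
    # Run the fast step sequentially over the whole batch inside one job.
    for sample in ${samples}; do
        my_tool "\$sample" > "\$sample".out    # my_tool is a placeholder command
    done
    """
}

workflow {
    Channel.fromPath(params.inputs)
        | collate(params.batch_size)   // emit lists of up to batch_size files
        | PROCESS_BATCH
}
```

With, say, one million inputs and `batch_size = 100`, the scheduler sees roughly 10,000 jobs instead of a million, at the cost of coarser retry granularity (a failure reruns the whole batch).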