You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The initialisation of the SchedulerOptions class sets some default options which turn out to be incompatible with the SLURM scheduler. For example N means number of nodes in Slurm, but experi currently thinks it always means job name.
This is an example of a more general problem: job schedulers just have valid and invalid options, so defaulting to the option another scheduler (or the shell) uses doesn't make sense. If SLURMOptions doesn't recognise a command then it should either throw an error immediately or include it without question as an arbitrary key. What it should not do is default to the option used by another scheduler, because that's probably wrong too, and the end result is an error which is harder to debug.
This makes me wonder if inheritance (of SLURMOptions from SchedulerOptions) is actually the right paradigm to be using here? It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour. Or just do away with defaults completely and make people specify every batch option they want in the .yml file via the arbitrary keys method... (I actually think that's a good idea for reproducibility purposes too)
Thanks for all the issues you have raised in relation to SLURM. It will take me a little while to work through and consider all of them. I don't actually have a production SLURM system to test on so I really appreciate you testing this functionality out.
Is this just the clash of the N? Possibly also o. Yes this is definitely a bug
I definitely agree that the SchedulerOptions class and subclasses needs some refactoring. I thought there was going to be far more overlap between the different schedulers when I started writing that code.
It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour.
This is exactly what I would like. It would probably be possible with an abstract base class, yet that doesn't help that the classes are rather clunky.
Or just do away with defaults completely...
One of design goals of the project is that there are sensible defaults. I realise I should document this, along with the other design goals somewhere #67. An alternative idea to this I have had is to use system specific configuration options #52, where the general options like name, are specified generally, then something like Project which is specific to a system specified separately.
Again thanks for raising all the issues/pull requests, I am getting to them all.
The initialisation of the
SchedulerOptions
class sets some default options which turn out to be incompatible with the SLURM scheduler. For exampleN
means number of nodes in Slurm, but experi currently thinks it always means job name.This is an example of a more general problem: job schedulers just have valid and invalid options, so defaulting to the option another scheduler (or the shell) uses doesn't make sense. If
SLURMOptions
doesn't recognise a command then it should either throw an error immediately or include it without question as an arbitrary key. What it should not do is default to the option used by another scheduler, because that's probably wrong too, and the end result is an error which is harder to debug.This makes me wonder if inheritance (of
SLURMOptions
fromSchedulerOptions
) is actually the right paradigm to be using here? It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour. Or just do away with defaults completely and make people specify every batch option they want in the.yml
file via the arbitrary keys method... (I actually think that's a good idea for reproducibility purposes too)Also for reference this is the set of commands used by the machine I'm trying to run my jobs on, and here is a helpful "Rosetta Stone for job schedulers" to compare against.
The text was updated successfully, but these errors were encountered: