Clashes between default options of different schedulers #61

TomNicholas · 2018-11-12T11:26:47Z

The initialisation of the SchedulerOptions class sets some default options which turn out to be incompatible with the SLURM scheduler. For example N means number of nodes in Slurm, but experi currently thinks it always means job name.

This is an example of a more general problem: job schedulers just have valid and invalid options, so defaulting to the option another scheduler (or the shell) uses doesn't make sense. If SLURMOptions doesn't recognise a command then it should either throw an error immediately or include it without question as an arbitrary key. What it should not do is default to the option used by another scheduler, because that's probably wrong too, and the end result is an error which is harder to debug.

This makes me wonder if inheritance (of SLURMOptions from SchedulerOptions) is actually the right paradigm to be using here? It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour. Or just do away with defaults completely and make people specify every batch option they want in the .yml file via the arbitrary keys method... (I actually think that's a good idea for reproducibility purposes too)

Also for reference this is the set of commands used by the machine I'm trying to run my jobs on, and here is a helpful "Rosetta Stone for job schedulers" to compare against.

The text was updated successfully, but these errors were encountered:

malramsay64 · 2018-11-13T03:49:46Z

Thanks for all the issues you have raised in relation to SLURM. It will take me a little while to work through and consider all of them. I don't actually have a production SLURM system to test on so I really appreciate you testing this functionality out.

Is this just the clash of the N? Possibly also o. Yes this is definitely a bug

I definitely agree that the SchedulerOptions class and subclasses needs some refactoring. I thought there was going to be far more overlap between the different schedulers when I started writing that code.

It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour.

This is exactly what I would like. It would probably be possible with an abstract base class, yet that doesn't help that the classes are rather clunky.

Or just do away with defaults completely...

One of design goals of the project is that there are sensible defaults. I realise I should document this, along with the other design goals somewhere #67. An alternative idea to this I have had is to use system specific configuration options #52, where the general options like name, are specified generally, then something like Project which is specific to a system specified separately.

Again thanks for raising all the issues/pull requests, I am getting to them all.

These options are different between schedulers and create clashes (see issue #61).

malramsay64 added a commit that referenced this issue Nov 19, 2018

BUG Stop parsing single letter scheduler options

fdb9281

These options are different between schedulers and create clashes (see issue #61).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clashes between default options of different schedulers #61

Clashes between default options of different schedulers #61

TomNicholas commented Nov 12, 2018

malramsay64 commented Nov 13, 2018

Clashes between default options of different schedulers #61

Clashes between default options of different schedulers #61

Comments

TomNicholas commented Nov 12, 2018

malramsay64 commented Nov 13, 2018