Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clashes between default options of different schedulers #61

Open
TomNicholas opened this issue Nov 12, 2018 · 1 comment
Open

Clashes between default options of different schedulers #61

TomNicholas opened this issue Nov 12, 2018 · 1 comment

Comments

@TomNicholas
Copy link
Contributor

The initialisation of the SchedulerOptions class sets some default options which turn out to be incompatible with the SLURM scheduler. For example N means number of nodes in Slurm, but experi currently thinks it always means job name.

This is an example of a more general problem: job schedulers just have valid and invalid options, so defaulting to the option another scheduler (or the shell) uses doesn't make sense. If SLURMOptions doesn't recognise a command then it should either throw an error immediately or include it without question as an arbitrary key. What it should not do is default to the option used by another scheduler, because that's probably wrong too, and the end result is an error which is harder to debug.

This makes me wonder if inheritance (of SLURMOptions from SchedulerOptions) is actually the right paradigm to be using here? It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour. Or just do away with defaults completely and make people specify every batch option they want in the .yml file via the arbitrary keys method... (I actually think that's a good idea for reproducibility purposes too)

Also for reference this is the set of commands used by the machine I'm trying to run my jobs on, and here is a helpful "Rosetta Stone for job schedulers" to compare against.

@malramsay64
Copy link
Owner

Thanks for all the issues you have raised in relation to SLURM. It will take me a little while to work through and consider all of them. I don't actually have a production SLURM system to test on so I really appreciate you testing this functionality out.

Is this just the clash of the N? Possibly also o. Yes this is definitely a bug

I definitely agree that the SchedulerOptions class and subclasses needs some refactoring. I thought there was going to be far more overlap between the different schedulers when I started writing that code.

It would be better to have some structure which allows new users to easily plug in the options for their particular scheduler, and not have to worry about whether they have correctly overwritten all the default behaviour.

This is exactly what I would like. It would probably be possible with an abstract base class, yet that doesn't help that the classes are rather clunky.

Or just do away with defaults completely...

One of design goals of the project is that there are sensible defaults. I realise I should document this, along with the other design goals somewhere #67. An alternative idea to this I have had is to use system specific configuration options #52, where the general options like name, are specified generally, then something like Project which is specific to a system specified separately.

Again thanks for raising all the issues/pull requests, I am getting to them all.

malramsay64 added a commit that referenced this issue Nov 19, 2018
These options are different between schedulers and create clashes (see
issue #61).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants