Allow scheduler options import from configuration file#405
Conversation
lesteve
left a comment
There was a problem hiding this comment.
Thanks a lot for the PR! A few quick remarks:
- you could add an empty entry in
jobqueue.yaml. I guess addingscheduler_options:would be enough. It is a bit annoying for people updating dask-jobqueue because it may break their config. Because of this I would be in favour of doing.getforscheduler_optionsand live with the inconsistency ... - tests for this would be very nice to have because it's easy to get it wrong. I believe we already some similar tests look for
dask.config.setin the tests
| # scheduler_options overrides parameters common to both workers and scheduler | ||
| scheduler_options = dict(default_scheduler_options, **scheduler_options) | ||
|
|
||
| config_scheduler_options = dask.config.get("jobqueue.%s" % config_name).get('scheduler_options') or {} |
There was a problem hiding this comment.
I am not 100% sure but I would do something like this:
if scheduler_options is None:
scheduler_options = dask.config.get("jobqueue.%s" % config_name, {})
scheduler_options = dict(default_scheduler_options, **scheduler_options)What I am aiming at here is that the scheduler_options override completely the config scheduler_options rather than trying to combine both and have some keys set from the config and some key set from the parameter.
To me it feels simpler and less error-prone this way.
There was a problem hiding this comment.
What I am aiming at here is that the scheduler_options override completely the config scheduler_options rather than trying to combine both and have some keys set from the config and some key set from the parameter.
Thanks for putting that up! Actually, I had implemented what I did there very intentionally, because I like to have things flexible. However, I just had a look on how Dask jobqueue handles this for other, i.e. the (variable length) list-type parameters. No merge happening there, so I have decided to implement your suggestion to keep this consistent.
This reverts commit ab5571e.
|
Thanks also for the helpful entry points on testing! I have come up with two scenarios that might be useful: One is checking that parameters provided in the config are actually used (to specify the scheduler options), the other checks if parameters manually passed completely overwrite those from the config. (In the earlier version I also had tests that checked for the default behaviour, as well as passing the Overall, I am not entirely sure if my
Good idea, that would at least indicate that a per-file configuration of the |
lesteve
left a comment
There was a problem hiding this comment.
code + test looks good as a first glance, thanks!
I am wondering 2 things
- whether we should put an empty
scheduler_options:section for all the schedulers injobqueue.yaml. Up for debate. - whether there should be a test to read from a file with a non-trivial
scheduler_optionsin the config file. This is to make sure this works. I think it should but having a test would be better because I am not 100% confident
I think, there is more pro than contra here, so I have implemented it now. (In the config file, I've also took the freedom to move the
I have started to implemented this, but then I thought that this is actually a topic of general relevance. If I see it correctly, all the config file options are currently tested only by manually defining a dictionary and adding it to the global |
…options-from-config-file
|
Thanks for repairing the CI, seems to have worked now! |
|
@kathoef sorry I am quite busy with other things, right now ... can you remote the "Draft" status of this PR (not sure how to do this I have to admit). I started using github before this Draft status existed, what is the advantage out of curiosity ? I have always seen things like WIP in title that worked well enough ... I'll try to review this PR before next Monday, don't hesitate to ping me if I don't do that ! |
Thanks for taking the time to leave a quick note! Yes, I can remove the draft status. (I have a "ready to review" button just above the "All checks have passed" part, maybe only I can see this?)
That's a very good question. I am relatively new to collaborative code work and to me this seemed like the clean way to go (when I opened this PR, I was sure that the code wasn't ready to be merged). Maybe it's actually a formal mechanism to ensure an agreement between developers and reviewers about when to merge? If a reviewer can't change the PR status, then this would be my interpretation. (In addition to that, unfortunately, I have absolutely no idea about how Github interprets this WIP title syntax. That's probably also why I have chosen the draft PR feature, it felt less like "doing it wrong".)
No urgency, and yes I will, thank you! |
lesteve
left a comment
There was a problem hiding this comment.
Looks good, I have a few small comments.
BTW I forgot to review your PR before Monday as I promised, you should have pinged me ;-)
| ) as cluster: | ||
| scheduler_options = cluster.scheduler_spec["options"] | ||
| assert scheduler_options.get("interface") == pass_scheduler_interface | ||
| assert scheduler_options.get("port") is None |
There was a problem hiding this comment.
Maybe add a comment before this line to explain that if scheduler_options is passed as an argument it completely replaces the scheduler_options config value rather than overriding the keys.
There was a problem hiding this comment.
I think it could be useful to add a comment about this. But shouldn't this be added also for the list-type entries/arguments that get fully overwritten? As the tests mostly don't have any comments, I am also wondering if is this a good / the best place to document this behaviour. (Maybe one could also move these two test cases into separate functions and give them more meaningful names? But, in my impression, this would break with the current style.)
|
|
||
| pass_scheduler_interface = list(net_if_addrs.keys())[1] | ||
|
|
||
| generic_cluster_conf = { |
There was a problem hiding this comment.
Do you need generic_cluster_conf i.e. can you not pass a smaller dict without these keys set to their default value?
There was a problem hiding this comment.
Thanks for questioning this approach, I now came up with a shorter implementation. I guess, though, the drawback is the very "hacky" retrieval of the cluster's config_name. Any idea on how to improve that?
By thinking about where to retrieve the config_name variable, I also stumpled across an inconsistency for the cluster implementation's config_name variables. As it would be good practice to do so, I have opened a separate PR on it. (Should probably also have done that for moving these extra: [] entries in the configuration...)
There was a problem hiding this comment.
Thanks for questioning this approach, I now came up with a shorter implementation. I guess, though, the drawback is the very "hacky" retrieval of the cluster's config_name. Any idea on how to improve that?
Well, I have just solved that (stupid) "hacky" part... still open for suggestions on the present approach, though. ;)
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
Yes, sorry... I had some medical intervention on Tuesday last week, and I guess this made several things slip off my table already on Monday... thanks for taking care. |
|
Looks like the CI also failed because I was too sloppy... I have now restored the empty dictionary that allows the I originally planned to move the tests into As the CI now has passed, could you have another look, @lesteve? |
| if scheduler_options is None: | ||
| scheduler_options = {} | ||
| scheduler_options = dask.config.get( | ||
| "jobqueue.%s.scheduler-options" % config_name, {} |
There was a problem hiding this comment.
Can you explain me why the {} is needed ? I would have thought that since scheduler-options is in jobqueue.yaml it should not be needed but the CI was failing without the {} so I must be missing something ...
There was a problem hiding this comment.
The "problem" is that the configuration dictionaries used for testing are all set up explicitely, and thus won't inherit from jobqueue.yaml, see here for an example. (Copying that approach is actually how I came up with this generic_cluster_conf dictionary.)
Instead of having
dask.config.get("jobqueue.%s.scheduler-options" % config_name, {})
it would also be possible to add scheduler-options: {} for each of these configuration dictionaries used for testing of the specific clusters.
However, then we would actually require the existence of a scheduler-options entry for standalone configurations, which would break already existing user configurations.
There was a problem hiding this comment.
However, then we would actually require the existence of a scheduler-options entry for standalone configurations, which would break already existing user configurations.
OK fair enough, for not breaking backward-compatibility, let's keep the .get
|
Sorry for taking such a long time to merge this one :-S. I don't have that much time to look at Dask-Jobqueue these days unfortunately ... FYI I pushed a few tweaks:
|
|
When will this patch be in a release? |
|
Not sure maybe in one or two weeks if I get the chance to do a release by then, but there is no guarantee about this unfortunately. In the meantime two work-arounds :
|
@lesteve, what are the pending tasks/PRs that need to be addressed prior the next release? I'm happy to help out |
Thanks a lot for your help! I had a quick look at the open issues and I think I would need to figure out what to do about #196. I am leaning towards providing a work-around at the moment rather than changing the dask-jobqueue code so this would not be a blocker. Feel free to have a quick look at the list of issues to see if I have missed something! |
This is a possible implementation of #400. For a start, I have decided to go for as little extra code as possible and have added a
config_scheduler_optionsvariable that is set by a cluster-specific nestedscheduler_optionsentry (as was suggested by @lesteve above).However, in terms of consistency/design choices, I am not entirely sure about the
scheduler_optionsentry being fully optional. The advantage is probably, that everybody who doesn't need this feature won't have to extend their already existing user-specific configuration files. The disadvantage is certainly, that configurable options are now inconsistently handled. If I see it correctly, every parameter that could be set from a configuration file is currently required to exist as an entry and I think it would be user-friendly to follow this logic also for the scheduler options (which means, however, extending jobqueue.yaml by even more redundant code lines, which I am a bit hesitant to do).A general way out could be to implement configurable options not as currently done by
which throws a key error and terminates execution if an entry does not exist, but by
which returns
Noneif an entry is not existing. If I haven’t overseen anything, the overall behaviour of the code would not change, because a required configuration entrycores: nullis also simply translated intocores = None. Implementing configurable options like this would save lots of code lines and allow to set up very minimalistic configuration files.On the other hand, it would certainly get less clear what is actually configurable. Maybe keeping it inconsistent, but properly documenting it, is the way to go… or keeping it minimalistic and properly documenting the available options?
Either ways, input is greatly appreciated...