Add shebang option to support more shells #189
Previously the shell (shebang) was hardcoded to '#!/bin/bash'. This PR fixes that: some of the queueing schedulers had their own shebangs hardcoded as well, so you could end up with two shebang lines. To accommodate more shells, this adds an option to specify the shell via the shebang. If you think this implementation is fine, I will add it to the documentation as well.
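To make the double-shebang problem concrete, here is a minimal sketch of a job-script builder that takes the shebang as an option and emits it exactly once. The names `build_job_script` and `DEFAULT_SHEBANG`, the header lines, and the worker command are hypothetical and are not dask-jobqueue's actual code.

```python
DEFAULT_SHEBANG = "#!/usr/bin/env bash"  # assumed default for this sketch


def build_job_script(header_lines, command, shebang=DEFAULT_SHEBANG):
    """Assemble a batch script with exactly one shebang on the first line."""
    # Drop any shebang that a scheduler-specific header may have hardcoded,
    # so the configured one is the only interpreter line.
    body = [line for line in header_lines if not line.startswith("#!")]
    return "\n".join([shebang, *body, "", command, ""])


script = build_job_script(
    ["#!/bin/bash", "#SBATCH -J dask-worker", "#SBATCH -t 00:02:00"],
    "dask-worker tcp://scheduler:8786",
    shebang="#!/usr/bin/env zsh",
)
print(script)
```

Filtering the header is only one way to avoid the duplicate; removing the hardcoded lines from each scheduler's template would work equally well.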
        env_extra = dask.config.get('jobqueue.%s.env-extra' % self.scheduler_name)
    if log_directory is None:
        log_directory = dask.config.get('jobqueue.%s.log-directory' % self.scheduler_name)
    if shebang is None:
I'd reconsider adding a kwarg for this. In my understanding, kwargs are for things you change frequently. This looks like an issue that you have to sort out once for your deployment and then never touch it again. Just adding it to the config should cover all cases.
I'm not sure, and I have no strong feeling on this. Currently it looks like we have kind of a match between kwargs and config file. I'm not sure which way we should go.
I did it this way only because that's how everything else in the YAML files is set. I don't really mind either way. Maybe let me know, @guillaumeeb?
Yeah, that is what I thought. If so, leave it as is, the way everything else is done. Maybe we'll need to check whether all the kwargs are justified or not in the future.
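The kwarg-versus-config question comes down to the fallback pattern visible in the diff: an explicit keyword argument wins, otherwise the value is read from the config. A minimal sketch, with a plain dict standing in for dask.config and a hypothetical `resolve_option` helper:

```python
# Stand-in for the per-scheduler section of jobqueue.yaml (assumed values).
CONFIG = {
    "jobqueue.slurm.shebang": "#!/usr/bin/env bash",
    "jobqueue.slurm.log-directory": None,
}


def resolve_option(name, value=None, scheduler_name="slurm", config=CONFIG):
    """Return the kwarg if given, else the configured value for the scheduler."""
    if value is None:
        value = config["jobqueue.%s.%s" % (scheduler_name, name)]
    return value


from_config = resolve_option("shebang")               # nothing passed: config wins
from_kwarg = resolve_option("shebang", "#!/bin/tcsh")  # explicit kwarg wins
```

With this pattern a config-only option would simply drop the `value` parameter, which is the alternative raised in the review.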
    local-directory: null # Location of fast local storage like /scratch or $TMPDIR

    # PBS resource manager options
    shebang: "#!/bin/bash"
Shouldn't we go for #!/usr/bin/env bash ?
Yep, looks like it, good catch!
    with SLURMCluster(walltime='00:02:00', processes=4, cores=8, memory='28GB') as cluster:

        job_script = cluster.job_script()
        assert job_script.startswith("#!/bin/bash")
This should also go into the tests for the other schedulers.
I believe this should only go in https://github.com/dask/dask-jobqueue/blob/master/dask_jobqueue/tests/test_jobqueue_core.py as this is not specific to any scheduler.
Looks good!
guillaumeeb left a comment
Nice catch @danpf, and many thanks for the PR!
It looks good, just address the points highlighted by @willirath. Just not sure about whether it should go into kwargs or not...
@willirath you're right, this is not ideal. We did that when implementing multiple config in #68. Feel free to open an issue to discuss other possibilities for that point.
And thanks @willirath for the review here, this is really appreciated.
Previously the default shell was "/bin/bash"; now it respects the environment. Also, the shebang tests were only in SLURM; now there is one for each scheduler that we support.
guillaumeeb left a comment
I think you can simplify the tests a bit with pytest's parametrize feature.
        assert ' --preload mymodule' in cluster._command_template


    def test_shebang_settings():
You should use a parametrized test, like the one below, for all of this!
      env_extra=['export LANG="en_US.utf8"',
                 'export LANGUAGE="en_US.utf8"',
    -            'export LC_ALL="en_US.utf8"']
    +            'export LC_ALL="en_US.utf8"'],
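As an illustration of the parametrized approach, here is a hedged sketch that runs one test body against several cluster classes. `FakeCluster`, `FakePBS`, and `FakeSLURM` are hypothetical stand-ins so the example runs without a scheduler; the real tests would pass the actual cluster classes instead.

```python
import pytest


class FakeCluster:
    """Hypothetical stand-in for a scheduler-specific cluster class."""

    submit_prefix = "#"

    def __init__(self, shebang="#!/usr/bin/env bash"):
        self.shebang = shebang

    def job_script(self):
        # The shebang is always the first line of the generated script.
        return "%s\n%s -J dask-worker\n" % (self.shebang, self.submit_prefix)


class FakePBS(FakeCluster):
    submit_prefix = "#PBS"


class FakeSLURM(FakeCluster):
    submit_prefix = "#SBATCH"


@pytest.mark.parametrize("Cluster", [FakePBS, FakeSLURM])
def test_shebang_settings(Cluster):
    # pytest runs this body once per class in the list above.
    default = Cluster().job_script()
    assert default.startswith("#!/usr/bin/env bash")

    custom = Cluster(shebang="#!/bin/tcsh").job_script()
    assert custom.startswith("#!/bin/tcsh")
```

Adding a scheduler then means adding one entry to the parameter list rather than copying the test.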
LGTM, @willirath do you see something else?
I updated the documentation to reflect that shebangs are required, because otherwise a batch script could be submitted without an interpreter. I also added a test to cover this.
Sorry to bother you more; please also review the documentation updates I added.
I'm not sure the additional requirement mock is needed to solve the problem at hand. In Xarray's tests, they use a context manager to directly change the dask config:
https://github.com/pydata/xarray/blob/4a5d88dba5fd50d48ab00ed2ebaee058287ab0bf/xarray/tests/test_dask.py#L838
So you could just use
    with pytest.raises(ValueError, match="batch script interpreter"):
        with (dask.config.set(shebang=None)):
            Cluster(cores=2, memory='4GB')
Apart from the question about the added dependency on
    assert job_script.startswith(default_shebang)

    with pytest.raises(ValueError, match="batch script interpreter"):
        with (dask.config.set(shebang=None)):
Sorry, my example code was wrong. dask.config.set({'jobqueue.%s.shebang' % Cluster.scheduler_name: None}) should work.
thanks, my initial idea was way worse.
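For reference, dask.config.set accepts a mapping of dotted keys to values and can be used as a context manager, so a test can override one value temporarily without a mocking library. A minimal sketch, assuming dask is installed; the scheduler name "demo" and the helper functions are hypothetical:

```python
import dask

KEY = "jobqueue.demo.shebang"  # "demo" is a made-up scheduler name


def read_shebang():
    """Read the current value, or None when the key is not configured."""
    return dask.config.get(KEY, default=None)


def shebang_inside_override(value):
    """Return the value seen while the temporary override is active."""
    with dask.config.set({KEY: value}):
        return dask.config.get(KEY)


inside = shebang_inside_override("#!/bin/zsh")
```

The override is undone when the `with` block exits, so other tests see the original configuration.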
@guillaumeeb Good to go from py point of view!
guillaumeeb left a comment
We're almost there, but I feel there are two things to address:
- I don't think we want to raise an exception if no shebang is defined (but I may be wrong)
- We shouldn't put the shebang kwarg or config value in all the docs, as the default is perfectly fine for most cases.
Sorry to be a bit pernickety...
    if shebang is None:
        raise ValueError("You must specify a batch script interpreter to use"
                         " like ``#!/usr/bin/env bash``")
Is this really a problem if there is no shebang?
What concerns me here is that this exception might break existing jobqueue configurations, where no shebang is declared. From my point of view it shouldn't.
We could maybe add a deprecation warning instead: if shebang is None everywhere, warn and automatically set it to #!/usr/bin/env bash to preserve the old configuration.
You are right that this breaks existing configs in its current state. But you cannot submit a batch script to most schedulers without a shebang:
sbatch: error: This does not look like a batch script. The first
sbatch: error: line must start with #! followed by the path to an interpreter.
sbatch: error: For instance: #!/bin/sh
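The deprecation-style fallback suggested in this comment could look roughly like the sketch below. It was not what the PR ended up doing; the function name `resolve_shebang` and the warning text are hypothetical.

```python
import warnings

DEFAULT_SHEBANG = "#!/usr/bin/env bash"


def resolve_shebang(shebang):
    """Fall back to the default interpreter instead of raising on None."""
    if shebang is None:
        warnings.warn(
            "No shebang configured; falling back to %r. "
            "Set one explicitly to silence this warning." % DEFAULT_SHEBANG,
            DeprecationWarning,
            stacklevel=2,
        )
        return DEFAULT_SHEBANG
    return shebang


# Capture the warning so the fallback can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = resolve_shebang(None)
```

Compared with raising ValueError, this keeps existing configurations working while still telling the user that something is missing.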
Wait, I may be wrong about this. I need to try this PR to see if it breaks my config. Maybe it doesn't, default value should be taken from the module default jobqueue.yaml file.
looking at the code I think you're right. It should default to bash. I'll wait for you to try it though...
Sorry, I did not find time today, and this will not be possible until Friday. You could verify it yourself, though: just create a jobqueue.yaml config file and do not add a shebang line.
I believe the only way to reach that code is to create a local config file and set shebang to null in it. I'm thus in favor of removing this code, as it seems very unlikely that a user would end up in this case without knowing it.
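The reasoning here rests on how a user config is layered over the packaged defaults. A minimal sketch with plain dicts standing in for the two YAML files, assuming that an explicit null overrides the default while an omitted key inherits it; `effective_config` and the values are hypothetical:

```python
# Assumed packaged defaults, standing in for the module's jobqueue.yaml.
PACKAGED_DEFAULTS = {"slurm": {"shebang": "#!/usr/bin/env bash", "cores": None}}


def effective_config(user_config, defaults=PACKAGED_DEFAULTS):
    """Overlay a user's config on the defaults, one scheduler section at a time."""
    merged = {name: dict(section) for name, section in defaults.items()}
    for name, section in user_config.items():
        merged.setdefault(name, {}).update(section)
    return merged


# A user file that never mentions shebang inherits the default ...
omitted = effective_config({"slurm": {"cores": 36}})
# ... while an explicit null (None after YAML parsing) overrides it.
nulled = effective_config({"slurm": {"cores": 36, "shebang": None}})
```

Under that assumption, only the second case would ever reach the `if shebang is None` branch.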
    name: dask-worker
    cores: 36
    memory: 270GB
    shebang: '#!/usr/bin/env bash'
You should remove the shebang from "real life" configurations, as keeping the default shebang is apparently perfectly fine for those. This includes ARM, but also Cheyenne and NERSC above.
Are these what the DOE is using as their jobqueue.yaml? In that case they 'should' set the shebang key-value in this updated version. Currently the config is set per scheduler, and the shebang is pulled from the scheduler's key-value config. This is solved if we use the deprecation method I suggested, although I think it's best practice for the documentation to show these as set so people understand.
I think they shouldn't if they keep the default value, see my response to the other comment. I will try this tomorrow. Default shebang value should be inherited from default config if not specified.
And so, based on the above discussion, I think we really should remove the majority of the examples containing shebang from the docs, keeping only a few specific ones. Maybe add the config you are using on your cluster to the examples section?
I deleted a bunch, I only left it in places where the docs describe the extra args as well!!
    cluster = PBSCluster(processes=18,
                         threads=4,
                         shebang='#!/usr/bin/env bash',
Still the same, shebang is optional, maybe put it once, but not everywhere.
You've got flake8 checks failing here.
Looks like this is all fine now?
Yes! Thanks to both of you here, merging.