Use environment variable to override root dask config location #3798
Comments
Any thoughts on this issue? I'd be happy to issue a PR if there is agreement on how to proceed.
Thoughts @mrocklin?
My apologies for the lack of response here @jhamman. I've been swamped. I don't have any particular objection to this if we can get a couple other people to weigh in. Maybe @jacobtomlinson, @lesteve, and @jcrist for variety across deployment scenarios?
No strong thoughts. If I were to rewrite those lines, I'd cut back the number of potential directories that configurations are loaded from. As is, the user gets a mix of files from potentially a few spots. This seems potentially error-prone, as one user might store ...
I'd prefer to have a single root config directory and a single user config directory. The location of each could be set by an environment variable; if it is not set, a list of potential directories would be searched, stopping at the first path that exists. I don't see the need for more than two configuration locations (one admin, one user), and cutting back the number of locations that may be loaded from seems easier to reason about. Something like (untested):
    import os
    import sys

    def get_config_path(envvar, search_paths):
        if envvar in os.environ:
            # if explicitly specified, only use that
            return os.environ[envvar]
        else:
            # Find the first directory that exists, and use that exclusively.
            for path in search_paths:
                if os.path.exists(path):
                    return path
        # No path found
        return None

    # find the root configuration directory, either from an environment
    # variable, or the first existing directory in a search list
    root_config_path = get_config_path('DASK_ROOT_CONFIG',
                                       ['/etc/dask',  # absolute system-wide location
                                        os.path.join(sys.prefix, 'etc', 'dask')])

    # find the user configuration directory, either from an environment
    # variable, or the first existing directory in a search list
    user_config_path = get_config_path('DASK_CONFIG',
                                       [os.path.join(os.path.expanduser('~'), '.config', 'dask'),
                                        os.path.join(os.path.expanduser('~'), '.dask')])

    paths = []
    if root_config_path is not None:
        paths.append(root_config_path)
    if user_config_path is not None:
        paths.append(user_config_path)
    # continue on with code as written...
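To make the lookup rule above concrete, here is a small self-contained sketch that exercises the same idea against temporary directories. The `environ` parameter and the `DEMO_ROOT_CONFIG` variable name are assumptions added only so the example runs without touching the real environment; neither is part of dask.

```python
import os
import tempfile


def get_config_path(envvar, search_paths, environ=None):
    """Return the path named by ``envvar``, else the first existing search path."""
    # ``environ`` is a hypothetical hook for testing; it defaults to os.environ.
    environ = os.environ if environ is None else environ
    if envvar in environ:
        # An explicit setting wins, whether or not the directory exists.
        return environ[envvar]
    for path in search_paths:
        if os.path.exists(path):
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing")
    first = os.path.join(tmp, "first")
    second = os.path.join(tmp, "second")
    os.mkdir(first)
    os.mkdir(second)

    # No variable set: the search stops at the first directory that exists.
    found = get_config_path("DEMO_ROOT_CONFIG", [missing, first, second], environ={})
    # Variable set: the search list is ignored entirely.
    forced = get_config_path(
        "DEMO_ROOT_CONFIG", [first], environ={"DEMO_ROOT_CONFIG": "/opt/site/dask"}
    )
    # Nothing set and nothing exists: no path is returned.
    nothing = get_config_path("DEMO_ROOT_CONFIG", [missing], environ={})
```

Under this rule exactly one root location and one user location are ever consulted, which is the property argued for in the comment above.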
I think I agree with @jcrist's comments. Personally, I'm a big fan of storing config in the environment, so I tend to use the environment variable version of this stuff. However, when using config files locally I prefer to have a single system config dir and then a user-specific config dir. Generally
Reasons for the following:
In general I agree with the sentiment of reducing config locations, but pragmatically I have yet to run into a case where conflicts have confused anyone. I've seen active use cases for all of them except for
It sounds like in general people are OK with the name
I'm not saying we should remove support for any of them, just that I doubt people have a need for more than one root/user config location (and the possibility of using more than one root/user config location seems dangerous). If I see a
My vote is for override, not additional, following the logic above.
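The difference between the two behaviours being voted on can be sketched as follows. This is an illustration only: the function name `root_paths` and its `mode` flag are hypothetical, and the default `/etc/dask` is taken from the issue description rather than from dask's source.

```python
def root_paths(env_value, default="/etc/dask", mode="override"):
    """Return the root config locations that would be read.

    ``env_value`` stands in for the value of a root-config environment
    variable (``None`` when the variable is unset).
    """
    if env_value is None:
        return [default]
    if mode == "override":
        # The variable replaces the default location.
        return [env_value]
    if mode == "additional":
        # The variable adds a second location; both would be read.
        return [default, env_value]
    raise ValueError(f"unknown mode: {mode!r}")


override = root_paths("/opt/site/dask", mode="override")
additional = root_paths("/opt/site/dask", mode="additional")
unset = root_paths(None)
```

With "override" there is only ever one root location to reason about; with "additional" a file left in the default location could still influence the result.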
I think you understand the motivation for this issue already, but for others: our system administrators are not willing to put a dask config in /etc/. They are, however, willing to set an environment variable in the default user environment.
I have complete sympathy for sysadmins refusing to do things. I used to be a sysadmin at the Met Office. However, if they were to install dask from an RPM or DEB package, this is where it would drop the config, so I'm not entirely sure why they would refuse this. Perhaps this is a valid use case for the one @mrocklin hasn't seen in the wild yet.
On my particular HPC machine, packages are stored in modules akin to virtual environments. When I load a module (e.g. Python), my environment (path and env variables) is altered.
We also use
Dask's new configuration file structure looks in /etc/dask for a system-wide configuration. I'd like to suggest this be overridable via an environment variable. I've included the relevant lines below.
dask/dask/config.py, lines 18 to 29 in 69fc200
In particular, I'd like to see line 19 changed to something like:
This would allow system admins to point to a common config file without mandating it be placed in /etc/...
I'll note that there is already a DASK_CONFIG environment variable. This is, however, appended to the list of dask configs, so it would be difficult to use for global/system-wide configurations.
Btw, my particular use case is for providing a system-wide configuration for dask-jobqueue on a large HPC system.
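A minimal sketch of the kind of change proposed for the root location, assuming the variable is called `DASK_ROOT_CONFIG` (the name used elsewhere in this thread) and that the default stays `/etc/dask`. The helper `root_config_path` and its `environ` parameter are hypothetical; this is not the code from dask/config.py.

```python
import os


def root_config_path(environ=None):
    """Return the system-wide config directory, honouring an override variable."""
    # ``environ`` is a hypothetical hook for testing; it defaults to os.environ.
    environ = os.environ if environ is None else environ
    # Fall back to /etc/dask only when the variable is not set.
    return environ.get("DASK_ROOT_CONFIG", "/etc/dask")


default_path = root_config_path({})
site_path = root_config_path({"DASK_ROOT_CONFIG": "/opt/site/dask"})
```

An administrator could then export the variable in the default user environment (for example from a module file) instead of writing anything under /etc.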