Add Configuration #6

Merged
merged 5 commits into from
Jul 1, 2018

Conversation

Member

@mrocklin mrocklin commented Jul 1, 2018

This adds a basic config file and uses that configuration throughout the
dask-yarn codebase.

This also switches the convention away from providing memory size in
megabytes and instead using strings with units. We provide informative
error messages to guide users to correct behavior.

cc @jcrist for review. In particular I want to check in and make sure that you're ok with the memory handling conventions. They're different from skein. We can walk back from this a bit if you like.

MANIFEST.in Outdated
recursive-exclude * __pycache__
recursive-exclude * *.py[co]
include versioneer.py
include dask_yarn/_version.py
Member

versioneer isn't currently used.

Member Author

Fixed

worker_memory : int, optional
    The amount of memory in MB to allocate per worker.
worker_memory : str, optional
    The amount of memory to allocate per worker like '2GB' or '4096 MiB'.
Member

Could we support both strings and int (in MB)?

Member Author

We could, though I think it's a strange default. Of course, it could be argued that it's a default choice that's made for us by YARN. Current convention in Dask is that if the user provides a number then that number corresponds to the number of bytes, not megabytes.

It's pretty obvious today what the user wanted, and we can automatically convert any number less than 1e6 to megabytes if that would be nicer. I somewhat prefer the current convention because it is fully explicit, but I can see how it might be unpleasant to a user familiar with YARN conventions. Happy to make the change.

Member

It's easier to add later than to remove. Fine as is.
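
For illustration only, the two conventions weighed in this thread could be sketched as below. This is not the code merged in this PR: `parse_memory_bytes`, its `small_numbers_are_mb` flag, and the unit table are hypothetical names, and reading a bare "MB" as 2**20 bytes is an assumption. By default the sketch accepts only strings with units and rejects bare numbers with an explanatory error; the flag enables the "numbers below 1e6 are megabytes" fallback that was discussed and left out.

```python
import re

# Hypothetical unit table: decimal (MB, GB) and binary (MiB, GiB) suffixes.
_UNITS = {
    "b": 1,
    "kb": 10**3, "mb": 10**6, "gb": 10**9, "tb": 10**12,
    "kib": 2**10, "mib": 2**20, "gib": 2**30, "tib": 2**40,
}

# A number, optional whitespace, then a unit suffix, e.g. '2GB' or '4096 MiB'.
_PATTERN = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]+)\s*$")


def parse_memory_bytes(value, small_numbers_are_mb=False):
    """Return a memory size in bytes (hypothetical helper, for illustration)."""
    if isinstance(value, str):
        match = _PATTERN.match(value)
        if match is None or match.group(2).lower() not in _UNITS:
            raise ValueError(
                "Could not interpret %r as a memory size. Use a string "
                "with units, like '2GB' or '4096 MiB'." % (value,)
            )
        number, unit = match.groups()
        return int(float(number) * _UNITS[unit.lower()])

    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if small_numbers_are_mb and is_number and value < 1e6:
        # Fallback discussed in the thread; treating MB as 2**20 bytes
        # here is an assumption of this sketch.
        return int(value * 2**20)

    # Explicit convention: bare numbers are rejected with guidance.
    raise TypeError(
        "Memory must be given as a string with units, like '2GB' or "
        "'4096 MiB', got %r." % (value,)
    )
```

With the default settings, `parse_memory_bytes('2GB')` returns `2000000000` and `parse_memory_bytes(2048)` raises a `TypeError`. Adding the fallback later only loosens what is accepted, which fits the "easier to add later than to remove" argument.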

@@ -1,11 +1,15 @@
from setuptools import setup

with open('requirements.txt') as f:
    install_requires = f.read().strip().split('\n')
Member

I personally prefer to keep it all in the setup.py file, but this isn't a strong preference.

Member Author

Most projects today seem to keep things in requirements.txt. I'm inclined to stay with that convention.

I also like this personally because it's easy to look up what a package requires quickly in a data file rather than looking through a setup.py file. Although in this case the setup.py is simple enough that this isn't a very solid argument. The "common convention" one though I think does stand.

msg = ("You must provide a path to a redeployable environment for the "
       "workers.\n"
       "This is commonly achieved through conda-pack.\n\n"
       "See https://dask-yarn.readthedocs.org/en/latest/environments.html "
Member

Does this docs page exist?

Member Author

Not yet! I was going to play with documentation next.

@jcrist jcrist merged commit ad9e62e into master Jul 1, 2018
@jcrist jcrist deleted the config branch July 1, 2018 21:45
@jcrist jcrist mentioned this pull request Jul 2, 2018