This is ready for review and merge.

Would it make sense to define fuse configuration parameters tied to a specific collection to make configuring this more straightforward across the board?

@lr4d yes that's definitely a clean option, although it would break backwards compatibility.

My apologies for the delay @crusaderky. I think that @jcrist is in charge of the maintenance team this week. Hopefully he or others can handle this soon?
jcrist
left a comment
Thanks @crusaderky, overall this looks like a nice change. Just one small nit then ready for merge.
dask/optimization.py
Outdated
    return (_enforce_max_key_limit(concatenated_name),) + first_key[1:]


_use_config = object()
In other places we use "__no_default__" for this. Since tooling like sphinx, ipython, etc... will repr the defaults in the signature, having something more human-readable can be nice.
Alternatively, in skein I use something like:
# overriding `__reduce__` lets the option be properly picklable as a singleton as well.
default = type('default', (object,),
               dict.fromkeys(['__repr__', '__reduce__'],
                             lambda s: 'default'))()
for the same purpose.
Either is fine, but I'd prefer something other than a raw object().
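To make the display concern concrete, here is a minimal sketch comparing how a raw `object()` default and a named string sentinel show up in a signature. The function names `with_raw` and `with_named` are hypothetical and exist only for this illustration; they are not part of dask.

```python
import inspect

# Two candidate sentinels for "argument not passed".
_raw = object()
no_default = "__no_default__"


def with_raw(x=_raw):
    """Hypothetical function using a raw object() as its default."""
    return x


def with_named(x=no_default):
    """Hypothetical function using a human-readable string sentinel."""
    return x


# inspect.signature reprs the defaults, much like sphinx or ipython do.
raw_sig = str(inspect.signature(with_raw))
named_sig = str(inspect.signature(with_named))

print(raw_sig)    # contains an opaque "<object object at 0x...>" address
print(named_sig)  # shows the sentinel's name instead
```

The raw object renders as a memory address that changes between runs, which is why a sentinel with a stable, descriptive repr reads better in generated docs.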
I'll change it to a PEP-compliant pattern https://www.python.org/dev/peps/pep-0484/#support-for-singleton-types-in-unions
Done, see if you like it now
Sorry, the issue wasn't around typing or any of that, but more how it displays to the user in ipython/sphinx/etc.... Using an enum for this is fine, provided it has a descriptive repr (UseConfig.token does not).
We don't currently use python type annotations in dask, adding them is a separate issue and shouldn't be done here (IMO).
Perhaps
In [12]: class Default(enum.Enum):
    ...:     value = ""
    ...:     def __repr__(self):
    ...:         return "<default>"
    ...:

In [13]: def foo(x=Default.value):
    ...:     pass
    ...:

In [14]: foo?
Signature: foo(x=<default>)
Docstring: <no docstring>
File:      ~/Code/dask/<ipython-input-13-ed0fa1715aa4>
Type:      function
Modified as requested.
Sphinx HTML rendered output:
dask.optimization.fuse(dsk, keys=None, dependencies=None, ave_width=<default>,
max_width=<default>, max_height=<default>, max_depth_new_edges=<default>,
rename_keys=<default>, fuse_subgraphs=<default>)
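For readers who want to reproduce that rendering outside Sphinx, the following is a rough sketch of the enum-sentinel pattern under stated assumptions: `fuse_like` and the `_CONFIG` dict are hypothetical stand-ins (not dask's actual `fuse` or `dask.config`), and the member name `token` is illustrative.

```python
import enum
import inspect


class Default(enum.Enum):
    """Sentinel meaning "read this value from the configuration"."""

    token = 0

    def __repr__(self):
        # A descriptive repr is what tools show in rendered signatures.
        return "<default>"


# Hypothetical stand-in for a loaded configuration.
_CONFIG = {"ave-width": 1}


def fuse_like(ave_width=Default.token):
    """Hypothetical function: fall back to the config when not passed."""
    if ave_width is Default.token:
        ave_width = _CONFIG["ave-width"]
    return ave_width


signature_text = str(inspect.signature(fuse_like))
print(signature_text)
```

Because the sentinel is compared with `is`, any value a caller passes explicitly, including `None`, remains distinguishable from "not passed".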
Couldn't we introduce some config parameters in the form of

@lr4d What you say makes sense and it's a clean design if backwards compatibility is not an issue.

Thanks!

On master, Travis reports some failed builds with "KeyError: 'optimization'" in config.py (e.g. https://travis-ci.org/github/dask/dask/jobs/688159707, https://travis-ci.org/github/dask/dask/jobs/688159706). I noticed because I see the same error in our nightly tests of dask master with pyarrow master in Apache Arrow.

This is because I've opened #6221 to discuss making yaml a required dependency, since that seems more tenable than forcing duplication of defaults between the config file and the code.
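As a rough illustration of how this kind of error can surface, the sketch below assumes the defaults live only in a config file that failed to load, so a dotted lookup hits an empty dict. The `get` helper here is a hypothetical, simplified lookup written for this example, not dask's `config.get`.

```python
_no_default = "__no_default__"


def get(config, key, default=_no_default):
    """Hypothetical helper: look up a dotted key in a nested dict."""
    node = config
    for part in key.split("."):
        try:
            node = node[part]
        except (KeyError, TypeError):
            if default is not _no_default:
                return default
            raise
    return node


# Assumption: defaults come only from a file, so nothing is there if it was not read.
loaded = {"optimization": {"fuse": {"ave-width": 1}}}
empty = {}

value_when_loaded = get(loaded, "optimization.fuse.ave-width")

try:
    get(empty, "optimization.fuse.ave-width")
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]  # the first path component that was absent

# Duplicating the default in code avoids the error, at the cost noted above.
value_with_fallback = get(empty, "optimization.fuse.ave-width", default=1)
```

The in-code fallback is exactly the duplication of defaults that the comment above would rather avoid.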
Overhaul the config file and default values for fuse().
subgraphs: true would explicitly set fuse_subgraphs=False when invoking fuse() and have it ignored max-width and max-depth-new-edges and explicitly pass None when invoking fuse to make them revert to the dynamic calculation

Caveats: none. I was extremely conservative in the change so there shouldn't be anything at risk of breaking. This came at the cost of a rather unsightly and inconsistent design in the optimization functions for array and bag - where I strictly retained the previous behaviour.
From looking at the code I see that many other config settings could use the same kind of attention - but they would substantially increase the scope of the change, so I'd rather leave them to a later PR.
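To illustrate the three cases described above (argument omitted, explicit value, explicit None), here is a minimal sketch. Everything in it is an assumption made for illustration: `resolve_max_width`, the `CONFIG` dict, and especially the placeholder formula `2 * ave_width`, which is not dask's actual dynamic calculation.

```python
import enum


class Default(enum.Enum):
    """Sentinel meaning "not passed; use the configured value"."""

    token = 0

    def __repr__(self):
        return "<default>"


# Hypothetical configuration values, standing in for the config file.
CONFIG = {"ave-width": 1, "max-width": 10}


def resolve_max_width(max_width=Default.token, ave_width=Default.token):
    """Hypothetical resolver for a fuse()-style parameter."""
    if ave_width is Default.token:
        ave_width = CONFIG["ave-width"]
    if max_width is Default.token:
        # Omitted: take whatever the configuration says.
        max_width = CONFIG["max-width"]
    if max_width is None:
        # Explicit None (or None in the config): dynamic calculation.
        # Placeholder formula only, NOT the one dask uses.
        max_width = 2 * ave_width
    return max_width


from_config = resolve_max_width()      # argument omitted
explicit = resolve_max_width(4)        # explicit value wins over the config
dynamic = resolve_max_width(None)      # None reverts to the dynamic value
```

Keeping the sentinel distinct from None is what lets an explicit None override a value set in the configuration.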