Dispatching to list subtypes #72
Comments
I think that for tuples, it might be useful to have Edit: Or, more generally, |
I don't like that this merges the idea of typed lists and regular lists. One of the advantages of the implementation in blaze is that it makes the two kinds of dispatches distinct. This is important when working on a large code base with thousands of dispatch definitions, many of them already dispatch on

```python
@dispatch(TypedList[int])
def f(xs):
    print('typed')

@dispatch(list)
def f(xs):
    print('regular')

f([1, 2, 3])
```

This currently prints

```python
In [11]: issubclass(TypedList[int], list)
Out[11]: False

In [12]: issubclass(list, TypedList[int])
Out[12]: False
```

Given that we can get the same auto-boxing behavior with two lines of Python, I am not sure this is a big feature. If anything, it forces us to explicitly register the

Regarding the use of

Finally, the

I would also like to make my stance on |
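The "two lines of Python" auto-boxing is not spelled out in the thread; a hedged sketch of the idea might look like the following, where `TypedList`, `register`, and the dispatch table are illustrative stand-ins, not multipledispatch's actual API:

```python
# Hypothetical sketch: auto-box a plain list into a typed wrapper before
# dispatching on its element type. Names here are stand-ins, not the
# real multipledispatch API.

class TypedList:
    def __init__(self, values):
        self.values = list(values)
        # Record the (single) element type; mixed or empty lists get `object`.
        types = {type(v) for v in self.values}
        self.item_type = types.pop() if len(types) == 1 else object

_registry = {}

def register(item_type, func):
    _registry[item_type] = func

def f(xs):
    # The "two lines": box plain lists, then dispatch on the element type.
    boxed = xs if isinstance(xs, TypedList) else TypedList(xs)
    return _registry[boxed.item_type](boxed)

register(int, lambda tl: 'typed ints: %r' % tl.values)
register(str, lambda tl: 'typed strs: %r' % tl.values)

print(f([1, 2, 3]))   # typed ints: [1, 2, 3]
print(f(['a', 'b']))  # typed strs: ['a', 'b']
```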
I, on the other hand, support typing. That said, Joe is right about overlapping, and how it could be difficult in a large codebase. I suggest writing a simple decorator on anything that will convert |
I'm also happy with a solution based more directly on @llllllllll's

The harder part is writing the

As for typing, I'll note that there are at least two ways to support it:

```python
# (1) pass the type into the decorator
@dispatch(List[Union[str, int]])
def f(x):
    ...

# (2) use function annotations
@dispatch()
def f(x: List[Union[str, int]]):
    ...
```

I would support version (1) directly in |
Note that multipledispatch does support PEP 484 type hints: #65 |
Oh, interesting. I'm not entirely sure that's a good idea to include for the default
I can see cases where I might use different types, e.g., |
Multiple dispatch supports subclasses correctly. It uses the same linearization algorithm as the Python MRO to partially order the types to search. |
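The MRO-based resolution Joe describes can be illustrated with a small stand-alone sketch (this is not multipledispatch's real code, just the idea of walking a class's linearization to find the most specific registered handler):

```python
# Sketch: resolve a handler for an argument type by walking the class's
# MRO, the same linearization Python uses for attribute lookup, so
# subclasses match before their bases.

class A: pass
class B(A): pass

registry = {A: 'handler for A', object: 'fallback'}

def resolve(cls):
    # The first class in the MRO with a registered handler wins.
    for candidate in cls.__mro__:
        if candidate in registry:
            return registry[candidate]
    raise NotImplementedError(cls)

print(resolve(B))    # handler for A  (B inherits A's handler)
print(resolve(int))  # fallback
```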
Oops, my mistake. |
Subtypes can come in useful. For example, I wouldn't feel okay asking XArray to change its code every time I add a new matrix format. And it will be useful in sparse for checking input arguments are a In light of @llllllllll's comments, I'm now in favour of @llllllllll's implementation, maybe while adding a simple wrapper that extracts the |
Hi everybody,
Right now the code is a bit of a hack on top of multipledispatch. [*] I created a wrapper called list_of instead of using the typing module, which I thought was not available in Python 2, and I felt it was closer to the existing multipledispatch notation to have list_of((type1, type2)) rather than List[Union[type1, type2]]. Even though the typing module is probably superior to my approach, maybe some of my test cases could still be useful. [**] I am proposing a gist and not a PR because in any case the code would need a bit of cleanup |
@llllllllll This is one part that I don't love about your solution, for two reasons:
So again, I really don't care exactly how we spell this, but clearly there's a strong need. To keep things moving forward, let me make another proposal of what the public API could look like here:

```python
from multipledispatch import dispatch, VarArgs

@dispatch(VarArgs[int])
def f(varargs):
    print('integers:', varargs.value)

@dispatch(VarArgs[str])
def f(varargs):
    print('strings:', varargs.value)

@dispatch(list)
def f(value):
    return f(VarArgs(value))

f([1, 2])          # integers: [1, 2]
f(['foo', 'bar'])  # strings: ['foo', 'bar']
f([[1, 2]])        # NotImplementedError: Could not find signature for f: <VarArgs[list]>
```

My main goal with this version is to keep things explicit and maintainable. We'll leave syntactic sugar up to downstream libraries:
We do need a metaclass to implement |
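A minimal sketch of the metaclass machinery that a parameterized `VarArgs[int]` would need, assuming per-type subclasses are cached so repeated lookups return the same class (this is illustrative, not the implementation multipledispatch ended up with):

```python
# Hedged sketch: a metaclass whose __getitem__ manufactures and caches a
# distinct subclass per element type, so VarArgs[int] is VarArgs[int].

class VarArgsMeta(type):
    _cache = {}

    def __getitem__(cls, item_type):
        if item_type not in cls._cache:
            name = '%s[%s]' % (cls.__name__, item_type.__name__)
            cls._cache[item_type] = VarArgsMeta(
                name, (cls,), {'item_type': item_type})
        return cls._cache[item_type]

class VarArgs(metaclass=VarArgsMeta):
    def __init__(self, values):
        self.value = tuple(values)  # copy into an immutable container

print(VarArgs[int] is VarArgs[int])       # True: cached
print(VarArgs[int].__name__)              # VarArgs[int]
print(issubclass(VarArgs[int], VarArgs))  # True
```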
We could easily make |
Don't you mean |
In this example, my function is explicitly unwrapping the original value with |
Ah, missed that. My bad. |
Is there progress on this? I'd be willing to lend a hand if required. I'd really like XArray/sparse integration to be okay-ish before late May. I'm submitting a conference paper to SciPy 2018 and it'd be really nice to show this in action. @llllllllll A go-between would be that instead of converting the sequence, you wrap it. That would get rid of the overhead, as @shoyer suggested. |
I was mostly waiting to see what @llllllllll thought of my last proposal #72 (comment). But I agree, it would be nice to get something in here so we can use it downstream. |
Allowing mutation of the type can lead to issues. What happens if you create an instance of |
Yes, this would be bad, but I think it would be enough to clearly document that you shouldn't do this? I suppose we could also restrict |
I think it should at least always be the same interface; you use tuples and frozensets very differently. Also, is the performance cost of copying the argument list into a tuple that high? What's the longest list you plan on passing to |
In xarray, ~10,000 elements is probably a reasonable upper bound on

So I'm OK always converting the args to |
+1 for composition |
Personally I'm fine with copying. I don't think that this will have an impact on performance. @hameerabbasi you may be concerned about copying all of the arrays when passing a list of arrays. I agree that this would be bad. However I don't think that that is actually what is being discussed here. I believe that this would just cause a copy of the container holding all of the references. I don't think that asymptotic arguments should necessarily hold sway here.

```python
In [1]: import numpy as np

In [2]: x = np.arange(100000000)

In [3]: L = [x] * 10000

In [4]: %time _ = tuple(L)
CPU times: user 159 µs, sys: 31 µs, total: 190 µs
Wall time: 199 µs
```

I may not be fully understanding the conversation here though. |
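The point that `tuple(L)` copies only references, never the underlying objects, is easy to verify without numpy (the `bytearray` below is just a stand-in for a large array):

```python
# Converting a list to a tuple copies only the references, not the
# objects they point to.

big = bytearray(10**6)   # stand-in for a large array
L = [big] * 1000
t = tuple(L)

print(t[0] is big)       # True: same object, nothing was deep-copied
```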
I understand that fully (the part where you only copy references). 😅 I might be biased here, but I'm pretty sure that:
I understand the lists themselves are likely to be very small (and the memory/performance overhead insignificant). But my OCD screams at me: " |
OK, good to know. Does that mean that you're removing your -1 above? I generally interpret -1 as "I am sufficiently against this idea that I block it from moving forward. I require that we find an alternative."
OK, these might be more serious reasons. I'm not up-to-date on this topic enough to judge. |
Yes, I remove the -1. Another reason I thought of: If you were to do it again to a mutated collection, it'll take another |
I am fine with having a tuple attribute instead of subclassing and implementing a sequence API that forwards to this tuple. This has the added benefit of not making the type partially orderable with tuple itself. Allowing mutation will just cause hard to find bugs unless we wrap the mutable collection in some magic object that changes type based on the inserted elements, but then we are back to O(n) to check all the inserted elements so we might as well copy. I also know that some people dislike dynamically changing your

```python
@dispatch(TypedSequence[int])
def foldl(f, sequence, starting_value=None):
    if starting_value is not None:
        sequence.insert(0, starting_value)
    return foldl(f, sequence)

# ...

foldl(f, [1, 2, 3], starting_value=Class())
```

In the recursive call, the sequence will still be typed as

I would also like to make a side-note about the name: |
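The composition approach Joe favors (a tuple attribute rather than subclassing) can be sketched as follows; the class name and attributes are illustrative, not a committed API:

```python
# Sketch: TypedSequence holds a tuple attribute instead of subclassing
# tuple, so it never partially orders against tuple in dispatch, and its
# contents are frozen at construction (one O(n) copy, then immutable).

class TypedSequence:
    def __init__(self, values):
        self.values = tuple(values)
        types = {type(v) for v in self.values}
        self.item_type = types.pop() if len(types) == 1 else object

    def __iter__(self):
        return iter(self.values)

    def __len__(self):
        return len(self.values)

seq = TypedSequence([1, 2, 3])
print(seq.item_type)           # <class 'int'>
print(isinstance(seq, tuple))  # False: composition, not inheritance
```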
Wouldn't this sort of thing be fixed by making the decorator |
multipledispatch doesn't kick in for keyword arguments. I think the function I wrote is a very real example; you could definitely have an optimized fast path for |
Ah, fair enough. I'm on board with copying, but I'd rather we built a keyword argument feature into this library. Something like |
I agree that it would be useful to support keyword arguments, but that should be discussed in a new issue. |
+1 for TypedSequence as the name.
|
+1 for TypedSequence and getting this in as quickly as possible. 😅 |
It sounds like we have consensus on the design -- somebody just needs to
implement it now!
|
As a (temporary?) alternative to full support for Python typing (#69), I'd like to propose adding `multipledispatch.TypedList`. The use case here is separate functions for different list subtypes, e.g., lists of strings vs lists of integers. See pydata/xarray#1938 for discussion. I have an example implementation here and am happy to work on putting together a PR if desired:
https://colab.research.google.com/drive/18zdyUpWLNFzFaz08GUOC5vs1GxE_jHg-#scrollTo=XDL0cBeS-lub

Example usage:

The exact public-facing API is up for discussion. I'm tentatively calling this `TypedList` for clarity and to distinguish it from `typing.List` (this is actually equivalent to `typing.List[typing.Union[...]]`).