New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
custom dispatchers #4
Conversation
Hello @RikVoorhaar! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-02-15 16:00:15 UTC |
Codecov Report
@@ Coverage Diff @@
## master #4 +/- ##
==========================================
- Coverage 97.58% 97.30% -0.28%
==========================================
Files 1 1
Lines 455 482 +27
==========================================
+ Hits 444 469 +25
- Misses 11 13 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for the PR! Looks generally great.
Regarding your first couple of points I've suggested something that should dispatch on args[0]
or args[0][0]
or args[0][0][0]
etc., but haven't tested. Let me know your thoughts.
Happy to keep the black formatting, although the two conflicts with PEP8 (my usual linter) would be nice to resolve (sounds like making line length 80 -> 79 is a common one) at maybe some later point.
I think you figured/guessed the make_translator
call correctly. It's not very clear but the idea is you take the ordered args you want (i.e. the numpy syntax) as the left tuple items, and translate them into the backend args + desired defaults (right tuple items).
autoray/autoray.py
Outdated
@@ -75,7 +75,10 @@ def do(fn, *args, like=None, **kwargs): | |||
<tf.Tensor: id=91, shape=(3, 3), dtype=float32> | |||
""" | |||
if like is None: | |||
backend = infer_backend(args[0]) | |||
dispatch_arg = _DISPATCHERS.get(fn, default_dispatcher)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using the defaultdict
this line can just be _DISPATCHERS[fn](...
.
autoray/autoray.py
Outdated
def join_array_dispatcher(*args, **kwargs): | ||
# NOTE: To be super safe, we could do a try / except, or | ||
# hasattr(args[0], '__getitem__'). hasattr is then probably better, | ||
# since there would be multiple types of errors to catch | ||
return args[0][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One solution here that might? solve the two problems you mention is:
def join_array_dispatcher(arg0, *args, **kwargs):
if infer_backend(arg0)== 'builtins':
return join_array_dispatcher(arg0[0])
return arg0
though you will still get an error with an e.g. empty tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also not perfect. It requires additional lookups, whereas the situation where args[0][0] is not defined is rare and not intended usage (at least within numpy API). Furthermore this will fail in the case that the sequences is not of builtins
type, for example if you use a numpy object array containing e.g. torch tensors.
Granted that's not a common use-case, but in those cases it will silently convert the tensors to the wrong type, and I think it's better to just throw an error, because in any case you should not call these functions with an empty tuple or one-dimensional array.
I think we should go for something like
def join_array_dispatcher(*args, **kwargs):
try:
return args[0][0]
except (TypeError, IndexError):
raise ValueError("array joining function should not be called with empty argument")
This way the user gets an informative error if called with wrong argument. Overhead of error handling should be less than the additional lookup you proposed, although I haven't timed it.
This still will silently convert to numpy backend if called with a one-dimensional sparse
array. I think that's however fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, I'm happy with this logic also.
autoray/autoray.py
Outdated
# einsum takes string as first argument a string, but backend should | ||
# be inferred from second argument. | ||
def einsum_dispatcher(*args, **kwargs): | ||
if isinstance(args[0], str): | ||
return args[1] | ||
return args[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to elaborate here, einsum
has the two following formats:
einsum('ab,bc->ca', x, y)
einsum(x, [0, 1], y, [1, 2], [2, 0])
thus line 455 which is uncovered by tests currently. I think fine as is, just thought I'd mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right I forgot about the second format. I think it's worth updating the tests to see if this works as intended for all libraries.
tests/test_autoray.py
Outdated
== ar.infer_backend(C2) | ||
== ar.infer_backend(C3) | ||
== backend | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a black preference (no final newline)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that's an honest mistake. Black is an autoformatter to be used in conjunction with the pycodestyle
linter, which also wants a newline at end of file. Most of the time pycodestyle
and the PEP8 guidelines agree. Biggest difference is the standard maximum line width, which is 79 for PEP8. However on a 1080p screen you can easily fit two editors next to each other with line width ~100 chars, so many people consider this line width a bit too strict.
Another very minor point, but all the |
good point, I'll update tests and readme |
I made the changes, please have another look. |
Only test that fails is |
All looks good, thanks very much for those changes and the PR! I'll tweak that unrelated failing test a bit. |
I implemented custom dispatchers as discussed in #3.
I wrote some tests to confirm it works as expected. One potential caveat, functions like
np.stack
can also eat a one-dimensional array, in which case it acts like the identity function. In those cases backend is inferred from first element of this one-dimensional array byjoin_array_dispatcher
. This works correctly for all supported backends except sparse, where backend is inferred as numpy. This is however not the intended usage ofstack
anyway, so I don't think this is a problem.For
join_array_dispatcher
we could optionally check ifargs[0][0]
even exists in the first place, and if not just returnargs[0]
. This is only relevant if you want to supply it empty sequences or 0-dimensional arrays (which numpy doesn't allow, for example).If so, one can check this with
hasattr(args[0], '__getitem__')
. This is a bit nicer than a try: / except: block, especially since we have to catch both IndexErrors and TypeErrors.It seems my auto formatter
black
was wreaking havoc on your code. Biggest thing it did is replacing single quotes for double quotes everywhere. If this bothers you, I will do a commit with only my actual changes. Otherwise, do you use a particular auto formatter for this project?I also took the liberty of adding translations for
np.take
. I'm a bit confused about the syntax ofmake_translator
. For torch implementation oftake
(i.e.index_select
) the names of arguments and order of arguments is both different, and I'm not 100% sure I did it correctly (at least it seems to work).