fix: implement explicit translation for NEP-18 #2089

agoose77 · 2023-01-08T00:30:10Z

This PR closes #2088 by defining explicit NEP-18 translations ("implementations"). These map the NumPy argument spec onto the Awkward function, and translate incompatible arguments (e.g. NumPy's kind vs Awkward's stable for ak.sort).

The TLDR of the motivation for this PR is that some NumPy functions like np.std have incompatible differences with our implementations e.g. ak.std. By explicitly defining a translation, we can decouple the two APIs.

This adds a slight runtime cost; now a call to np.sort() jumps through two additional functions. However, this is not an area of performance that we should care about (and CPython 3.11 speeds function calls up slightly, so it's a solved problem /s).
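For anyone who wants to gauge that overhead, a sketch along the following lines could be used. Everything here is a hypothetical stand-in (`ak_sort`, `translate`, `dispatch`); it only measures the cost of two extra Python-level calls, not the real Awkward dispatch path.

```python
import timeit


def ak_sort(array, axis=-1, stable=True):
    # Hypothetical stand-in for the real operation.
    return sorted(array)


def translate(a, axis=-1, kind=None):
    # Hypothetical translation layer: first extra call.
    return ak_sort(a, axis=axis, stable=kind in ("stable", "mergesort"))


def dispatch(func, args, kwargs):
    # Hypothetical dispatcher: second extra call.
    return func(*args, **kwargs)


data = [3, 1, 2]
direct = timeit.timeit(lambda: ak_sort(data), number=100_000)
layered = timeit.timeit(lambda: dispatch(translate, (data,), {}), number=100_000)
print(f"direct: {direct:.4f}s, via translation: {layered:.4f}s")
```

The absolute numbers depend on the machine and Python version; the difference only matters when such calls sit inside a tight Python loop.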

We could also use this mechanism to define the translations for non-Awkward overloaded NumPy functions, which just need better translations than our default heuristic-based approach provides.

codecov · 2023-01-08T00:37:45Z

Codecov Report

Merging #2089 (3056e89) into main (c4cb717) will decrease coverage by 0.08%.
The diff coverage is 70.96%.

Additional details and impacted files

Impacted Files	Coverage Δ
src/awkward/contents/indexedarray.py	`77.66% <0.00%> (ø)`
src/awkward/operations/ak_sort.py	`60.00% <27.27%> (-40.00%)`	⬇️
src/awkward/operations/ak_argsort.py	`75.00% <54.54%> (-25.00%)`	⬇️
src/awkward/operations/ak_count_nonzero.py	`77.27% <66.66%> (-2.73%)`	⬇️
src/awkward/operations/ak_isclose.py	`94.44% <66.66%> (-5.56%)`	⬇️
src/awkward/operations/ak_nan_to_num.py	`98.03% <66.66%> (-1.97%)`	⬇️
src/awkward/operations/ak_argmax.py	`60.00% <71.42%> (ø)`
src/awkward/operations/ak_argmin.py	`60.00% <71.42%> (ø)`
src/awkward/operations/ak_max.py	`60.00% <71.42%> (ø)`
src/awkward/operations/ak_mean.py	`53.65% <71.42%> (+0.88%)`	⬆️
... and 15 more

jpivarski

Thanks for defining this translation layer. You mention performance from indirection, but that ought to be pretty minimal, while the benefits of a translation layer are pretty significant.

We'd like to synchronize the Awkward arguments and the NumPy arguments as much as possible, though, because we want to minimize user surprise. For instance,

> NumPy's kind vs Awkward's stable for ak.sort

should eventually match NumPy's arguments better, though for sorting, I was under the impression that their arguments are in flux. (They don't want to explicitly say what algorithm they're using, such as "heapsort".)

In this PR, it looks like you added a translation layer for every NEP-18 overload, not just the ones that have different arguments. No, I don't see any examples that have exactly the same arguments (and maybe we'd always have highlevel and behavior, while NumPy would never have these two arguments).

agoose77 · 2023-01-09T09:09:04Z

> You mention performance from indirection, but that ought to be pretty minimal, while the benefits of a translation layer are pretty significant.

Yes, I like to make it clear that this does have an impact, but only if users are doing things in a loop. It's more of a "take note" rather than "this is a regression".

> We'd like to synchronize the Awkward arguments and the NumPy arguments as much as possible, though, because we want to minimize user surprise.

Definitely (in the long run). It should be a goal that we avoid un-motivated divergences. Sorting might be an aberration here, as you point out.

> In this PR, it looks like you added a translation layer for every NEP-18 overload, not just the ones that have different arguments.

The idea with these translation functions is that we're not directly coupled to NumPy's interface, which may have legacy arguments or other features that we don't support. If we do this for all functions, we'll keep some symmetry on our end. I think you're right that we should use this predominantly for compatibility rather than for needlessly deviating from the NumPy API.

> No, I don't see any examples that have exactly the same arguments (and maybe we'd always have highlevel and behavior, while NumPy would never have these two arguments).

Yes, and the main culprit here is out, which is nearly always present in NumPy operations and which we never define. This argument is guaranteed to break positional correspondence of subsequent arguments, so these custom NEP-18 translators (?) ensure that we can use the signature of the NumPy function exactly.

> should eventually match NumPy's arguments better, though for sorting, I was under the impression that their arguments are in flux. (They don't want to explicitly say what algorithm they're using, such as "heapsort".)

Should we make a change here (in deprecation cycles)?

jpivarski · 2023-01-09T15:47:00Z

> Should we make a change here (in deprecation cycles)?

It doesn't have to be done now. Eventually, I'd like to replace the sorting kernels—which currently defer to C++ sorting algorithms—with something that would be more like what we'll need to do in CUDA, but CUDA sorting is something that will require extra thought in itself.

agoose77 added 4 commits January 7, 2023 23:43

wip: add unsupported sentinel

bb3af92

feat: add NEP18 translations to all ak.operations

b19cbb1

docs: elaborate error messages

af78267

feat: pre-validate NEP-18 function for unsupported arguments

e1f6ce6

agoose77 requested a review from jpivarski January 8, 2023 00:30

fix: pass depth to maybe_posaxis

3056e89

agoose77 temporarily deployed to docs-preview January 8, 2023 00:45 — with GitHub Actions Inactive

agoose77 temporarily deployed to docs-preview January 8, 2023 01:15 — with GitHub Actions Inactive

jpivarski approved these changes Jan 8, 2023

agoose77 merged commit 7d38ba1 into main Jan 9, 2023

agoose77 deleted the agoose77/fix-nep-18-explicit branch January 9, 2023 09:38

agoose77 mentioned this pull request Jan 9, 2023

refactor: remove kind and order args to sorting protocols #2090

Merged
