Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: implement explicit translation for NEP-18 #2089

Merged
merged 5 commits into from
Jan 9, 2023

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented Jan 8, 2023

This PR closes #2088 by defining explicit NEP-18 translations ("implementations"). These map the NumPy argument spec onto the Awkward function, and translate incompatible arguments (e.g. NumPy's kind vs Awkward's stable for ak.sort).

The TLDR of the motivation for this PR is that some NumPy functions like np.std have incompatible differences with our implementations e.g. ak.std. By explicitly defining a translation, we can decouple the two APIs.

This adds a slight runtime cost; now a call to np.sort() jumps through two additional functions. However, this is not an area of performance that we should care about (and CPython 3.11 speeds function calls up slightly, so it's a solved problem /s).

We could also use this mechanism to define the translations for non-Awkward overloaded NumPy functions, which just need better translations that our default heuristic-based approach takes.

@codecov
Copy link

codecov bot commented Jan 8, 2023

Codecov Report

Merging #2089 (3056e89) into main (c4cb717) will decrease coverage by 0.08%.
The diff coverage is 70.96%.

Additional details and impacted files
Impacted Files Coverage Δ
src/awkward/contents/indexedarray.py 77.66% <0.00%> (ø)
src/awkward/operations/ak_sort.py 60.00% <27.27%> (-40.00%) ⬇️
src/awkward/operations/ak_argsort.py 75.00% <54.54%> (-25.00%) ⬇️
src/awkward/operations/ak_count_nonzero.py 77.27% <66.66%> (-2.73%) ⬇️
src/awkward/operations/ak_isclose.py 94.44% <66.66%> (-5.56%) ⬇️
src/awkward/operations/ak_nan_to_num.py 98.03% <66.66%> (-1.97%) ⬇️
src/awkward/operations/ak_argmax.py 60.00% <71.42%> (ø)
src/awkward/operations/ak_argmin.py 60.00% <71.42%> (ø)
src/awkward/operations/ak_max.py 60.00% <71.42%> (ø)
src/awkward/operations/ak_mean.py 53.65% <71.42%> (+0.88%) ⬆️
... and 15 more

@agoose77 agoose77 temporarily deployed to docs-preview January 8, 2023 00:45 — with GitHub Actions Inactive
@agoose77 agoose77 temporarily deployed to docs-preview January 8, 2023 01:15 — with GitHub Actions Inactive
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for defining this translation layer. You mention performance from indirection, but that ought to be pretty minimal, while the benefits of a translation layer are pretty significant.

We'd like to synchronize the Awkward arguments and the NumPy arguments as much as possible, though, because we want to minimize user surprise. For instance,

NumPy's kind vs Awkward's stable for ak.sort

should eventually match NumPy's arguments better, though for sorting, I was under the impression that their arguments are in flux. (They don't want to explicitly say what algorithm they're using, such as "heapsort".)

In this PR, it looks like you added a translation layer for every NEP-18 overload, not just the ones that have different arguments. No, I don't see any examples that have exactly the same arguments (and maybe we'd always have highlevel and behavior, while NumPy would never have these two arguments).

@agoose77
Copy link
Collaborator Author

agoose77 commented Jan 9, 2023

You mention performance from indirection, but that ought to be pretty minimal, while the benefits of a translation layer are pretty significant.

Yes, I like to make it clear that this does have an impact, but only if users are doing things in a loop. It's more of a "take note" rather than "this is a regression".

We'd like to synchronize the Awkward arguments and the NumPy arguments as much as possible, though, because we want to minimize user surprise.

Definitely (in the long run). It should be a goal that we avoid un-motivated divergences. Sorting might be an aberration here, as you point out.

In this PR, it looks like you added a translation layer for every NEP-18 overload, not just the ones that have different arguments.

The idea with these translation functions is that we're not directly coupled to NumPy's interface, which may have legacy arguments or other features that we don't support. If we do this for all functions, we'll keep some symmetry on our end. I think you're right, that we should try and use this predominantly for compatibility than for needlessly deviating from the NumPy API.

No, I don't see any examples that have exactly the same arguments (and maybe we'd always have highlevel and behavior, while NumPy would never have these two arguments).

Yes, and the main culprit here is out, which is nearly always present in NumPy operations, and we never define. This argument is guaranteed to break positional correspondence of subsequent arguments, so these custom NEP-18 translators (?) ensure that we can use the signature of the NumPy function exactly.

should eventually match NumPy's arguments better, though for sorting, I was under the impression that their arguments are in flux. (They don't want to explicitly say what algorithm they're using, such as "heapsort".)

Should we make a change here (in deprecation cycles)?

@agoose77 agoose77 merged commit 7d38ba1 into main Jan 9, 2023
@agoose77 agoose77 deleted the agoose77/fix-nep-18-explicit branch January 9, 2023 09:38
@jpivarski
Copy link
Member

Should we make a change here (in deprecation cycles)?

It doesn't have to be done now. Eventually, I'd like to replace the sorting kernels—which currently defer to C++ sorting algorithms—with something that would be more like what we'll need to do in CUDA, but CUDA sorting is something that will require extra thought in itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NEP 18 overloaded functions don't have same function signature
2 participants