Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Extend np.add ufunc to work with unicode and byte dtypes #24858

Merged
merged 14 commits into from Oct 17, 2023

Conversation

lysnikolaou
Copy link
Contributor

No description provided.

@lysnikolaou
Copy link
Contributor Author

One question to ask ourselves here is whether we want to also implement loops for mixed dtypes like unicode + bytes, unicode + object, bytes + object, unicode + void and bytes + void.

@mattip
Copy link
Member

mattip commented Oct 5, 2023

I don't think the ufunc fastpath has anything to offer once object dtype is used. And I don't know how common it is to add string_ and bytes_, so my first instinct would be to keep this as simple as possible.

@lysnikolaou
Copy link
Contributor Author

Here's another run of the benchmarks.

| Change   | Before [64fc516a] <main>   | After [6b970817] <string-ufuncs-add>   |   Ratio | Benchmark (Parameter)                               |
|----------|----------------------------|----------------------------------------|---------|-----------------------------------------------------|
| -        | 18.0±0.3μs                 | 3.04±0.05μs                            |    0.17 | bench_core.NumPyChar.time_add_small_list_big_string |
| -        | 3.29±0.01ms                | 9.77±0.5μs                             |    0    | bench_core.NumPyChar.time_add_big_list_small_string |

@ngoldbaum
Copy link
Member

And I don't know how common it is to add string_ and bytes_, so my first instinct would be to keep this as simple as possible.

I agree, we shouldn't assume the encoding of bytes_ data, which you would need to do I think.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of fly-by comments, since I had a look. But overall, it seems good.

numpy/core/src/umath/ufunc_type_resolution.c Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
@lysnikolaou
Copy link
Contributor Author

This appears to be ready to merge. Are there any more review comments for me to address?

@charris
Copy link
Member

charris commented Oct 16, 2023

It needs a release note. Look in doc/release/upcoming_changes/ for examples.

@lysnikolaou
Copy link
Contributor Author

lysnikolaou commented Oct 16, 2023

@charris Added a minimal release note. Should it be more comprehensive than that?

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, maybe @ngoldbaum can have a quick look, generally should be good to go though, but I would like the resolve dtypes to be clarified as "this doesn't make sense, but we need it for the DType class/kind".
Also, tests for things that were clearly buggy but fixed in later review would be good.

doc/release/upcoming_changes/24858.new_feature.rst Outdated Show resolved Hide resolved
numpy/core/defchararray.py Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
numpy/core/src/umath/ufunc_type_resolution.c Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
numpy/core/src/umath/string_ufuncs.cpp Outdated Show resolved Hide resolved
Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one super minor nit otherwise this looks great!

numpy/core/src/umath/ufunc_type_resolution.c Outdated Show resolved Hide resolved
@ngoldbaum
Copy link
Member

Thanks @lysnikolaou!

@ngoldbaum ngoldbaum merged commit 19bfa3f into numpy:main Oct 17, 2023
58 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants