
NumPy scalars #2060

Open · aldanor wants to merge 18 commits into master from feature/numpy-scalar

Conversation

@aldanor (Member) commented Jan 8, 2020

Closes #2036, #1512

  • Add py::numpy_scalar<> and py::make_scalar() as strict type casters between C and NumPy arithmetic types + tests + docs (see the usage sketch after this list)
  • Fix a bug where the PyArray_DescrNewFromType API code was wrong (it's not actually used at the moment, but still)
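
A rough usage sketch of the new casters (not verbatim from this PR; the accessor for the wrapped value, written here as a .value member, is an assumption and may differ from the final API):

#include <cstdint>

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

PYBIND11_MODULE(example, m) {
    // Accepts only a NumPy int32 scalar (e.g. np.int32(3)), not a plain Python int,
    // and returns a NumPy scalar rather than a Python int.
    m.def("next_i32", [](py::numpy_scalar<std::int32_t> x) {
        return py::make_scalar(static_cast<std::int32_t>(x.value + 1));
    });
}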

@aldanor (Member, Author) commented Jan 8, 2020

I think this is not the most efficient way of doing it, as it creates dtype objects along the way, but at least it's fully strict and correct.

I guess that instead of messing with dtypes, you could extract the Python types for all required NumPy scalar types and cache them somewhere, and then do a simple (non-NumPy) type equivalence check or something of the sort.

Edit: on second glance, PyArray_DescrFromType() already caches the arithmetic types internally, so np.dtype('i') is np.dtype('i') evaluates to True and it's not all that bad.
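
For illustration, using the NumPy C API directly (pybind11 routes these calls through its lazily loaded API table, and this sketch assumes NumPy has already been initialized via import_array()), the descriptors for built-in typenums are cached singletons:

#include <Python.h>
#include <numpy/arrayobject.h>
#include <cassert>

void check_descr_caching() {
    // Both lookups return the same cached PyArray_Descr object for a built-in
    // type, so no new dtype is allocated per conversion.
    PyArray_Descr *a = PyArray_DescrFromType(NPY_FLOAT32);
    PyArray_Descr *b = PyArray_DescrFromType(NPY_FLOAT32);
    assert(a == b);
    Py_DECREF(a);
    Py_DECREF(b);
}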

@ax3l (Collaborator) left a comment
Cool, thanks!
Would it be possible to provide a type to match all py::numpy_scalar types, similar to py::buffer, as well? This would reduce the number of overloads needed in user code.

Review comments on docs/advanced/pycpp/numpy.rst (resolved)
@aldanor (Member, Author) commented Jan 8, 2020

> Would it be possible to provide a type to match all py::numpy_scalar types, similar to py::buffer, as well? This would reduce the number of overloads needed in user code.

I'm not sure that's possible the way it's currently done, since it just matches typenums in a very strict way. Do you mean something like py::numpy_polymorphic_scalar? But then, to extract anything out of it, you'd need a whole layer of API resembling the array API.

@aldanor (Member, Author) commented Jan 8, 2020

OK, I tried to improve the type caster a little using a basic isinstance check; hopefully that's sufficient? (The arithmetic types and dtypes are now also cached statically.)

A little check:

m.def("f32_in", [](py::numpy_scalar<float>) { return 0; });
m.def("f32_out", []() { return py::numpy_scalar<float>(0.); })

Before:

>>> %timeit f32_in(f)
430 ns ± 8.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

>>> %timeit f32_out()
162 ns ± 0.424 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

After:

>>> %timeit f32_in(f)
355 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

>>> %timeit f32_out()
149 ns ± 0.819 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
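
For context, roughly what the faster load path amounts to (a sketch only, not the actual PR code; it assumes the scalar type object is fetched once via PyArray_TypeObjectFromType and cached):

// Sketch: an isinstance-style check against the cached np.float32 type object,
// followed by a raw read of the scalar's payload into a C float.
bool load_float32(PyObject *src, float &value) {
    static PyObject *scalar_type = PyArray_TypeObjectFromType(NPY_FLOAT32);
    if (!PyObject_TypeCheck(src, reinterpret_cast<PyTypeObject *>(scalar_type)))
        return false;                    // not an np.float32 (or a subclass of it)
    PyArray_ScalarAsCtype(src, &value);  // copy the wrapped value out
    return true;
}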

@aldanor (Member, Author) commented Jan 9, 2020

@ax3l Note that the current type names in the type signatures are only correct on a standard 64-bit platform, so they need to be changed to avoid confusion; i.e., float is not float32 in general, and double is not float64.

While we can surely just map py::numpy_scalar<float> to NPY_FLOAT, and the same for double (and the complex types), the way it's currently done, what do we write in the type signature? Note that float would overlap with the builtin float, which isn't nice. Also, the size of those floats/doubles is embedded in the resulting binary, so I guess it would be nice to show it (like float16 or float64 or whatever it is)? Any ideas?

Also, I guess we then want to bind long double as well? (And complex<float>, complex<double>, but then it's back to the same question as above.)

Update: OK, apparently a related thing already exists (npy_format_descriptor_name<>), the original version of which I actually added myself a few years back before it was compile-time, lol. I guess we could just map the native floats, doubles, etc., and use it to assign compile-time names to everything. I'll update and push this tomorrow.

@aldanor (Member, Author) commented Jan 9, 2020

OK, added long double and all three complex types, and updated the tests.

@aldanor force-pushed the feature/numpy-scalar branch 6 times, most recently from 0e15fe7 to e88307b on January 9, 2020 at 14:24
@aldanor (Member, Author) commented Jan 9, 2020

Looks green now. (Damn all those MSVC tests, that took a bit...)

Review comments on docs/advanced/pycpp/numpy.rst and include/pybind11/numpy.h (resolved)
@aldanor (Member, Author) commented Jan 10, 2020

And now with native int mappings, here are the test failures due to overlaps :) Hmm...

@aldanor (Member, Author) commented Jan 11, 2020

So... why do the tests fail? Hmm...

Native C types are mapped to NumPy types in a platform specific way: for
instance, ``char`` may be mapped to either ``np.int8`` or ``np.uint8``
depending on the platform. If you want to ensure specific NumPy types,
it is recommended to use fixed-width aliases from ``<cstdint>>``.
A Collaborator left a review comment on the docs excerpt above:

typo:

Suggested change:
- it is recommended to use fixed-width aliases from ``<cstdint>>``.
+ it is recommended to use fixed-width aliases from ``<cstdint>``.
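
A minimal sketch of what that docs passage recommends (assuming py::numpy_scalar accepts both plain char and the fixed-width aliases):

#include <cstdint>

// Platform-independent: std::uint8_t always binds to np.uint8.
m.def("takes_u8", [](py::numpy_scalar<std::uint8_t>) { return true; });

// Platform-dependent: char may bind to np.int8 or np.uint8.
m.def("takes_char", [](py::numpy_scalar<char>) { return true; });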

@cosama commented Aug 8, 2020

I've been using pybind11 for a project and it works really well; great work on getting here.

I don't know enough about the NumPy internals to help here much, I think, but it would be very useful for that project to also have an untyped py::numpy_scalar, analogous to py::array. At least it might make sense to name the current implementation py::numpy_scalar_t, similar to py::array_t, so that a py::numpy_scalar could be added in the future.
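
To illustrate the proposed naming analogy (hypothetical, not part of this PR):

py::array any_array;                       // existing: accepts any dtype
py::array_t<double> f64_array;             // existing: dtype fixed at compile time
// proposed parallel:
// py::numpy_scalar any_scalar;            // would accept any NumPy scalar (does not exist yet)
// py::numpy_scalar_t<double> f64_scalar;  // the typed form this PR currently calls py::numpy_scalar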

@aldanor: Thanks for working on this.


Successfully merging this pull request may close these issues.

Single precision float treated as doubles, np.float32 incompatible?