Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Have dtype transfer for equivalent user dtypes prefer user-defined `copyswapn` #10898

Merged
merged 1 commit into from May 28, 2018

Conversation

Projects
None yet
3 participants
@EricCousineau-TRI
Copy link
Contributor

commented Apr 12, 2018

Resolves #10897

This is a simple way that initially enables things like exposing autodiff scalars with heap-allocated derivatives. See the original issue for more info.

This Travis build job shows the code working with this patch (well, an older version of it):
https://travis-ci.org/RobotLocomotion/pybind11/jobs/364928883#L1630-L1635

\cc @njsmith - Since this is a relatively small change, figured I'd see if this appropriate to upstream at this point.

@eric-wieser

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

Is there a reason that enabling PyDataType_REFCHK doesn't work for you?

Perhaps renaming REFCHK to be NONTRIVIAL_COPY would be useful?

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented Apr 13, 2018

Hmm... My understanding, from having looked at some of the NumPy v1.11.0 source (what I'm using on Ubuntu), is that REFCHK implied that Py_INCREF / Py_DECREF could be called on the given object; in this case, the underlying memory of an AutoDiffXd dtype is not a Python object, but rather than the direct C++ memory, so it could cause issues.

Looking at master now, it seems that PyArray_INCREF and PyArray_XDECREF should work since typenum != NPY_OBJECT; however, routines like PyArray_FillObjectArray, used from PyArray_Zeros (_zerofill) and PyArray_Empty (used by np.full) appear to default to assuming that the memory is based on PyObject* (via NPY_COPY_PYOBJECT_PTR), which would cause segfaults (TTBOMK). Overall, it seems the policy for REFCHK is applied inconsistently?

Will still try it out on my test case to see if it works out, and post the results here. Thanks!

EDIT: Checked against v1.11.0 (unpatched) from system, and it works for np.copy, but segfaults when I use np.zeros. Will see how it works with master.

EDIT 2: Confirmed that np.zeros still segfaults against master (1.15.0.dev0+3ec8875) :(

If these inconsistencies were fixed, then NONTRIVIAL_COPY would definitely be useful! However, perhaps it could be named NONTRIVIAL_CTOR_DTOR_COPY to also be general enough to address #10721, as reference counting is just a general case of construction / destruction? (assuming there is space in user dtype API to add hooks for construction / destruction, or just some capsules that can store those methods in the dtype or class?)

That also being said, if there is a hook to cause np.zeros to fail fast with an exception and not a segfault, then it'd be easier to direct users to use casting (e.g. rather than np.zeros(..., dtype=AutoDiffXd), use np.zeros(...).astype(AutoDiffXd)). Is there a way to do this?
(At present, on the pybind11 branch, I have to do this for np.ones, as v1.11.0 uses some esoteric casting from np.int64 or something to implement np.ones...)

EDIT 3: I may risk the segfaults with np.zero, as long as that's the main thing that fails with this approach. Will test out a few more things and see if this is a viable route to avoid the need to fork to achieve the desired functionality.
EDIT 3.5: Seems that np.empty also suffers from the same behavior - added an edit above. So things like np.full will also cause segfaults. It could be mitigated by replacing np.full(..., AutoDiffXd(10)) something like x = np.zeros(...).astype(AutoDiffXd); x[:] = AutoDiffXd(10), but that seems to be stretching it a bit.

@eric-wieser

This comment has been minimized.

Copy link
Member

commented Apr 13, 2018

Looking again - seems you're right, REFCHK isn't appropriate at all.

Unfortunately, the dtype->flags field is a char, and all 8 flags are currently used.

So maybe this is the best call. I worry that there will be a lot of places that just assume they can memcpy a user dtype though, so this will be the first of many changes you need to make.

Why is it that a memcpy doesn't work for you?

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented Apr 13, 2018

Ah, gotcha. For the most part, though, it seems that the main algorithms do well in using the correct dispatches to copy arrays. Though I'm sure I'll run into that issue at some point. It'll be an interesting adventure, then :P

Why is it that a memcpy doesn't work for you?

In this case, AutoDiffXd (Eigen::AutoDiffScalar<Eigen::VectorXd>) has its value() stored immediately in the structure (which is trivially copyable), while its derivatives() are stored using Eigen::VectorXd, which is not trivially copyable since it has a pointer for its allocated data, and thus can't be memcpyd as the memory will be incorrectly aliased (and will later cause a double-free if the original and the copy have their destructors called or are assigned over).

EDIT: This may be jumping the design gun, but in the future, could PyArray_Descr::typeobj be used to store the additional metadata (e.g. ctor / dtor flags, ctor / dtor functions (capsules), etc.)? Or perhaps the dtype.metadata field?

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2018

Just checking back in on this PR - may I ask if there's anything I may do to increase confidence in its change to see if it can be merged into master? (testing, etc.?)

Perhaps instrument some user-defined dtype test to check which user-defined methods have been called?

@eric-wieser

This comment has been minimized.

Copy link
Member

commented Apr 20, 2018

I think this is probably fine as it. The change is pretty straightforward, and I assume your code works with it.

There's a slight performance penalty here for custom dtypes, but I imagine it's pretty minor - those users can presumably just call PyArray_GetStridedCopyFn internally, at which the cost becomes just a few extra layers of indirection, which doesn't scale with array size.

If no one else has any opinions on this, I'll merge it in a few days. Tagging with 1.15 so I don't forget.

@eric-wieser eric-wieser added this to the 1.15.0 release milestone Apr 20, 2018

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2018

Awesome, thanks!

There's a slight performance penalty here for custom dtypes, but I imagine it's pretty minor - those users can presumably just call PyArray_GetStridedCopyFn internally [...]

One concern is it looks like PyArray_GetStridedCopyFn is decorated with NPY_NO_EXPORT / NPY_VISIBILITY_HIDDEN - will users be able to call this method?
Is it something that should be part of the exported API, or perhaps offered as a separate, internal capsule? (like np._low_level["PyArray_GetStridedCopyFn"]?)

EDIT: Er, just to clarify, I don't think this PR should be blocked on account of a convenience method being user-accessible; for the time being, it seems like it's more-or-less trivial to write a non-swap version of copyswapn; as a simple example:
https://github.com/EricCousineau-TRI/pybind11/blob/e712b18acc86bd8c61df7c92433f1299d127236e/include/pybind11/numpy_dtypes_user.h#L700

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented Apr 26, 2018

@eric-wieser Just a small ping, if you're able to merge it - thanks!

(Would love to employ this on pydrake to parallel the C++ convention of scalar types for converting systems from double to AutoDiffXd and symbolic::Expression!)

@EricCousineau-TRI EricCousineau-TRI force-pushed the EricCousineau-TRI:prefer-user-copyswapn branch from 7d247f4 to 8507eb3 May 16, 2018

@eric-wieser

This comment has been minimized.

Copy link
Member

commented May 18, 2018

it seems like it's more-or-less trivial to write a non-swap version of copyswapn

My point is that PyArray_GetStridedCopyFn has gone to great lengths to try and be efficient, and a user implementation is unlikely to do so. Not a big deal though.

Can you perhaps add something to the release notes about this change?

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented May 18, 2018

Yup, can do!

Should I file an issue to consider exposing PyArray_GetStridedCopyFn publicly, so that user-defined dtypes can leverage it?

@EricCousineau-TRI EricCousineau-TRI force-pushed the EricCousineau-TRI:prefer-user-copyswapn branch from 8507eb3 to 682403e May 18, 2018

@EricCousineau-TRI

This comment has been minimized.

Copy link
Contributor Author

commented May 18, 2018

Done. Added an initial pass at release notes; please let me know what you'd like me to change.

* return a simple copy
*/
if (PyArray_EquivTypes(src_dtype, dst_dtype) &&
!PyDataType_REFCHK(src_dtype) && !PyDataType_REFCHK(dst_dtype) &&
( !PyDataType_HASFIELDS(dst_dtype) ||
is_dtype_struct_simple_unaligned_layout(dst_dtype)) ) {
is_dtype_struct_simple_unaligned_layout(dst_dtype)) &&

This comment has been minimized.

Copy link
@charris

charris May 25, 2018

Member

Indentation needs fix.

This comment has been minimized.

Copy link
@EricCousineau-TRI

EricCousineau-TRI May 25, 2018

Author Contributor

I had meant to have this align with col 13 since it's part of the conjunction && operands, so that it should align with the beginning of the parentheses (and !PyDataType_REFCHK(...l).

If you want, I can move this line up to make the continuation more evident?

This comment has been minimized.

Copy link
@charris

charris May 27, 2018

Member

NVM, I misread the code.

@charris

This comment has been minimized.

Copy link
Member

commented May 25, 2018

@eric-wieser Ping.

@charris

This comment has been minimized.

Copy link
Member

commented May 27, 2018

LGTM. @eric-wieser I'll leave this to you.

@eric-wieser eric-wieser merged commit 515900d into numpy:master May 28, 2018

6 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
lgtm analysis: C/C++ No alert changes
Details
lgtm analysis: JavaScript No alert changes
Details
lgtm analysis: Python No alert changes
Details
@eric-wieser

This comment has been minimized.

Copy link
Member

commented May 28, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.