Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit creation of NumPy arrays with a "base" object that owns the data #440

Merged
merged 3 commits into from
Oct 13, 2016

Conversation

wjakob
Copy link
Member

@wjakob wjakob commented Oct 12, 2016

This patch adds an extra base handle parameter to most py::array and
py::array_t<> constructors. If specified along with a pointer to
data, the base object will be registered within NumPy, which increases
the base's reference count. This feature is useful to create shallow
copies of C++ or Python arrays while ensuring that the owners of the
underlying can't be garbage collected while referenced by NumPy.

The commit also adds a simple test function involving a wrap()
function that creates shallow copies of various N-D arrays.

This patch adds an extra base handle parameter to most ``py::array`` and
``py::array_t<>`` constructors. If specified along with a pointer to
data, the base object will be registered within NumPy, which increases
the base's reference count. This feature is useful to create shallow
copies of C++ or Python arrays while ensuring that the owners of the
underlying can't be garbage collected while referenced by NumPy.

The commit also adds a simple test function involving a ``wrap()``
function that creates shallow copies of various N-D arrays.
@wjakob
Copy link
Member Author

wjakob commented Oct 12, 2016

This feature is related to discussions on #383 and is up for review (one of my projects needed the ability to create shallow copies of NumPy arrays, so I just went ahead and implemented it.)

cc @aldanor -- does it look reasonable to you?

@@ -156,8 +156,10 @@ NAMESPACE_END(detail)
(reinterpret_cast<::pybind11::detail::PyArray_Proxy*>(ptr)->attr)
#define PyArrayDescr_GET_(ptr, attr) \
(reinterpret_cast<::pybind11::detail::PyArrayDescr_Proxy*>(ptr)->attr)
#define PyArray_FLAGS_(ptr) \
(reinterpret_cast<::pybind11::detail::PyArray_Proxy*>(ptr)->flags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather just PyArray_GET_(flags) :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

tmp = object(api.PyArray_NewCopy_(tmp.ptr(), -1 /* any order */), false);
if (ptr) {
if (base) {
PyArray_GET_(tmp.ptr(), base) = base.inc_ref().ptr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read this part of numpy api in detail, but my guess is that it's our job to incref the owner, and it's numpy's job to decref it when the array dies, is that correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right.

flags = base_array.flags() & ~detail::npy_api::NPY_ARRAY_OWNDATA_;
else
/* Writable by default, easy to downgrade later on if needed */
flags = detail::npy_api::NPY_ARRAY_WRITEABLE_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, agreed, writeable should probably be the default when constructing arrays as we discussed earlier (and read-only a default when passing them around as args).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpret this comment as an observation with no action items :)

@@ -319,6 +345,11 @@ class array : public buffer {
return (size_t) PyArray_GET_(m_ptr, nd);
}

/// Base object
object base() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, is there a use case for exposing the base?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumPy has a .base flag, so I thought it would be nice to expose it in the C++ API as well.

This feature (pointer into memory owned by a "base") is really neat for exposing a NumPy "view" into a data structure that may be implemented in a completely different programming language. The base pointer lets you get back from the numpy view to the parent class, which can be useful in a function that is called without knowing the latter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(whether that last approach is a good design pattern is another matter)

@@ -158,3 +159,53 @@ def test_make_c_f_array():
assert not make_c_array().flags.f_contiguous
assert make_f_array().flags.f_contiguous
assert not make_f_array().flags.c_contiguous


@pytest.requires_numpy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a separate test for base lifetime? E.g., create base, bind array to it, del/decref base, make sure the array and base are still alive, kill array, make sure the base is also dead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will add one.

@wjakob
Copy link
Member Author

wjakob commented Oct 13, 2016

merged -- thanks for the feedback.

try {
file = module::import("sys").attr("stdout");
} catch (const import_error &) {
/* If print() is called from code that is executed as
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dean0x7d : FYI

Copy link
Member

@dean0x7d dean0x7d Oct 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting situation. I guess that comes up with the ConstructorStats output from destructors. Wouldn't it be better to add the try/catch to ConstructorStats rather than penalize every py::print call?

In general, any Python code executed in a destructor can throw an exception, so this isn't really isolated to py::print. Even with pure C++ code it's up to the destructor to make sure that no exceptions escape -- the called functions don't need to be aware if they are running from a destructor or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It came up when I added py::print to a destructor that (as part of implementing a new testcase) got called during interpreter shutdown. Many Python calls will still work in that situation, but importing new things is definitely forbidden. py::print is special in this regard that it needs to import sys, and then stdout, hence the decision to modify it.

@aldanor
Copy link
Member

aldanor commented Oct 13, 2016

This is great, thanks! :)

@rwgk rwgk mentioned this pull request Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants