Add unchecked array access via proxy object #746

jagerman · 2017-03-19T04:46:32Z

My stab at addressing #599/#617.

This adds bounds-unchecked access to arrays through a a.unchecked<Type, Dimensions>() method which returns a proxy object (after checking dimensionality and mutability). For array_t<T>, the Type template parameter is omitted. A readonly version (which doesn't require mutability when being created) is available as a.unchecked_readonly<...>().

(I chose the name unchecked rather than unsafe because it isn't exactly unsafe to use this, it's just potentially unsafe if not done carefully).

The one slightly unusual addition from @dean0x7d 's simple example is specifying the Dimensions as a template parameter. I did this because it allows storage of the strides and shape in std::arrays; having them stored that way (as opposed to storing a copy of the array's strides/shape pointers) allows the
compiler to make significant optimizations that it can't make with a pointer; testing with nested loops of the form:

    for (size_t i0 = 0; i0 < r.shape(0); i0++)
        for (size_t i1 = 0; i1 < r.shape(1); i1++)
            ...
                r(i0, i1, ...) += 1;

over a 10 million element array gave me around a 25% speedup (versus using a pointer) for the 1D case, 33% for 2D, and saw the 5D case run about 2.5 times as fast; the latter, in particular, was a pretty remarkably speedup.

Since the primary purpose of this interface is for unfettered, high-performance access, requiring a compile-time dimensionality seems worthwhile given the gains it can achieve. As a nice side effect, it also lets us catch an invalid number of indices at compile time, and allows operator [] access to be enabled only for 1D arrays.

I haven't added the tests into the PR yet; I wanted to invite comments on the proposal first.

This adds bounds-unchecked access to arrays through a `a.unchecked<Type, Dimensions>()` method. (For `array_t<T>`, the `Type` template parameter is omitted). A readonly version (which doesn't require mutability) is available as `a.unchecked_readonly<...>()`. Specifying the Dimensions as a template parameter allows storage of an std::array; having the strides and sizes stored that way (as opposed to storing a copy of the array's strides/shape pointers) allows the compiler to make significant optimizations of the shape() method that it can't make with a pointer; testing with nested loops of the form: for (size_t i0 = 0; i0 < r.shape(0); i0++) for (size_t i1 = 0; i1 < r.shape(1); i1++) ... r(i0, i1, ...) += 1; over a 10 million element array gives around a 25% speedup (versus using a pointer) for the 1D case, 33% for 2D, and runs more than twice as fast with a 5D array. Since the primary purpose of this interface is to get maximum performance, needing a compile-time dimensionality seems worthwhile given the gains it can achieve.

wjakob · 2017-03-19T14:40:46Z

I thin that this looks great. Regarding the dramatic performance improvements: it's possible that having the dimensions fixed at compile time makes the program structure simple enough that the compiler will add a vectorized path for your example computations (adding entries, etc.).

One really minor comments is that the reference classes should perhaps be in the detail namespace since they are not part of the explicit API. (this is just C++ duck typing where we return another type that behaves like an array).

wjakob · 2017-03-19T14:40:56Z

^ @SylvainCorlay

jagerman · 2017-03-19T14:46:59Z

@wjakob - I expect that's most of it; but it appears to also need the variables to be local to be able to apply that optimization. I'm guessing that with just a pointer, the compiler can't guarantee that something else hasn't changed the value at that memory location, and has to check it each time.

This gives a compile-time failure (rather than a later runtime failure) if attempting to return the proxy object (e.g. from a bound lambda with return type deduction).

aldanor · 2017-03-19T18:01:20Z

This looks great, especially the compile-time ndim on the proxies.

wjakob · 2017-03-19T21:41:38Z

Should the unchecked references also support the data() and mutable_data (if applicable) functions?

SylvainCorlay · 2017-03-19T22:56:45Z

Quick note on the writable flag: mutable accessors return non-const pointers and references, which can be used later on while the numpy array writability flag has changed. If there is no consistent way to support the mutability flag in C++, why not drop it completely?

wjakob · 2017-03-19T23:01:53Z

I don't think I would call that an inconsistency. When a pybind11 function invoked from Python (i.e. the main use case of pybind11), the GIL is held, so there is a well-defined window were we have exclusive access to arrays.

jagerman · 2017-03-19T23:57:02Z

Only enforcing writability at the time of making the reference is also what numpy itself does:

>>> z = np.array([[1,2],[3,4]])
>>> y = np.transpose(z)
>>> z.flags.writeable = False
>>> z[1,1] = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only
>>> z
array([[1, 2],
       [3, 4]])
>>> y[1,1] = 10
>>> z
array([[ 1,  2],
       [ 3, 10]])
>>> y
array([[ 1,  3],
       [ 2, 10]])

Yes, you can work around it if you try, but that doesn't mean it isn't still useful.

SylvainCorlay · 2017-03-20T07:10:21Z

OK, fair enough for the mutability check.

SylvainCorlay · 2017-03-20T07:12:16Z

Regarding static dimension, what alternative did you test this against? If you were storing e.g. std::vector shapes, a difference of performance can be attributed to the stack-allocated std::array over heap-allocated std::vector. However, if you were wrapping the underlying pointer, I don't see the reason why it should make a difference.

jagerman · 2017-03-20T13:22:26Z

I compared against both std::vector and storing a copy of the shape()/strides() pointers. Both were indistinguishable from each other.

SylvainCorlay · 2017-03-20T14:03:02Z

ok, it makes sense. So the only problem would be dynamic allocation vs stack allocation.

[ci skip]

wjakob · 2017-03-20T15:12:04Z

@jagerman: would you mind also adding data() and mutable_data()?

Other than that, is there anything else that needs to be done here?

jagerman · 2017-03-20T15:35:59Z

What would data()/mutable_data() do that array.data()/array.mutable_data() doesn't do?

jagerman · 2017-03-20T15:39:44Z

~~I think adding data() would also require adding a stride(n) method; without the stride, I don't see any use for direct access to the data() pointer. But that's easy enough to add.~~ Edit: I was misremembering how data() worked (it's a post-stride data pointer).

dean0x7d

This looks great!

Minor comment on access defaults: the existing functions are read-only by default (e.g. data()/mutable_data(), at()/mutable_at()), while the new functions flip that to read-write by default (unchecked()/unchecked_readonly()). It would be nice to make the default unchecked() call read-only for consitency.

dean0x7d · 2017-03-20T15:20:28Z

docs/advanced/pycpp/numpy.rst

+    m.def("increment_3d", [](py::array_t<double> x) {
+        auto r = x.unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
+        if (x.ndim() != 3)
+            throw std::runtime_error("error: 3D array required");


I could be missing something, but doesn't the unchecked() call already throw under the same condition?

Whoops, yeah. That's leftover from earlier versions where it didn't.

dean0x7d · 2017-03-20T15:38:17Z

include/pybind11/numpy.h

+     * care: the array must not be destroyed or reshaped for the duration of the returned object,
+     * and the caller must take care not to access invalid dimensions or dimension indices.
+     */
+    template <typename T, size_t Dims> detail::unchecked_reference<T, Dims> unchecked() {


I don't think that both const and non-const versions of this function are needed. So far in pybind11, const has referred to the state of the internal pointer PyObject *m_ptr, but not to the pointee. E.g. attr() is const because obj.attr("something") = value will modify the Python object, but not the pointer stored in obj.

Good point. Additionally with the unchecked() -> mutable_unchecked() rename it makes no sense at all to have a mutable_unchecked() that returns a non-mutable reference.

dean0x7d · 2017-03-20T15:38:56Z

include/pybind11/numpy.h

+
+template <typename T, size_t Dim>
+struct type_caster<unchecked_const_reference<T, Dim>> {
+    static_assert(Dim == (size_t) -1 /* always fail */, "unchecked array proxy object is not castable");


jagerman · 2017-03-20T15:53:12Z

AFAICS, r.mutable_data(1, 2, 3) is going to be exactly equivalent to &r(1, 2, 3); I'm not entirely sure I see the point.

wjakob · 2017-03-20T15:54:39Z

I just thought it is weird that code which uses mutable_data (maybe even without arguments to just get the pointer) would stop working if accessed through an unchecked reference. Thoughts?

wjakob · 2017-03-20T15:55:12Z

(maybe I'm just missing something)

jagerman · 2017-03-20T15:57:48Z

Such code is going to require change anyway, because it's going to have to access through a new proxy object rather than the array itself.

Makes it consistent with other numpy methods (e.g. mutable_at/at). Also delete the const version of `unchecked` that mapped to `unchecked_readonly`: it makes no sense at all with the rename for a `mutable_unchecked` to return a non-mutable object.

jagerman · 2017-03-20T16:21:49Z

Adding a:

template <typename... Ix> const T *data(Ix... ix) const { return &operator()(ix...); }

and exactly the same for mutable_data() (but in the mutable subclass, and without the two consts) would do it, I think; I just don't see that it offers much. (But I've no objection to adding it if you think it's worth having).

wjakob · 2017-03-20T16:26:47Z

I think that would be nice. Ditto for the itemsize() (can now be implemented via sizeof()), size() and nbytes() methods.

dean0x7d · 2017-03-20T17:41:40Z

A few more observations:

Is a runtime version needed? That is, .unchecked(3) in addition to .unchecked<3>()?
Iterators might be nice, but that's a lot more work for nD arrays, so it can probably wait.
We should probably discuss doccomment style consistently since there's a mixture of different styles for block comments /** */ in the codebase (leading asterisk, and without).

jagerman · 2017-03-20T21:18:28Z

I pushed a change implementing dynamic dimensions as the default if the size is omitted (and thus defaults to -1), so you can write:

auto r1 = arr.unchecked<T>();
auto r2 = arr_t.unchecked();

I also added the various utility methods (itemsize(), data(), etc.) and added documentation/tests for both.

That also makes it easier to compare performance; these are for 1-, 2-, and 5-nested loops that increment the value or calculate the sum of all matrix values, for 10M-element matrices of various shapes; the values are the timeit times with number=500 calls:

fixed, 1D, 10M,  incr 3.946771406001062
fixed, 1D, 10M,  sum  4.139643211001385
fixed, 10K x 1K, incr 3.954216455000278
fixed, 10K x 1K, sum  4.140550066000287
fixed, 100x100x10x10x10, incr 4.9838552410001284
fixed, 100x100x10x10x10, sum  4.188112009000179
dynamic, 1D, 10M,  incr 5.28798617999928
dynamic, 1D, 10M,  sum  4.169090997002058
dynamic, 10K x 1K, incr 6.072964282000612
dynamic, 10K x 1K, sum  4.407604636999167
dynamic, 100x100x10x10x10, incr 10.352673255998525
dynamic, 100x100x10x10x10, sum  8.596683744002803

wjakob · 2017-03-22T18:31:14Z

I pushed a change implementing dynamic dimensions as the default if the size is omitted (and thus defaults to -1)

That's super-neat! To me this all looks good (including the docs) -- shall we merge it?

jagerman · 2017-03-22T18:47:47Z

Merged!

I ran another test to see how much of an improvement this makes versus the current checked access. The answer: huge.

checked, 1D, 10M,  incr 31.572535417000836
checked, 1D, 10M,  sum  22.71654115000092
checked, 10K x 1K, incr 44.53464991600049
checked, 10K x 1K, sum  39.15508063499874
checked, 100x100x10x10x10, incr 109.21409312399919
checked, 100x100x10x10x10, sum  102.60653751300015

(numbers are directly comparable to the previous post).

dean0x7d · 2017-03-22T19:10:11Z

That's a pretty great speed up. Have you tried comparing to numpy itself? E.g. np.sum. I wonder if the compile-time dimensions give the proxy array an advantage over some of numpy's builtins.

Side note: The git history might have benefited from a bit of a cleanup. Fixes within the PR are not useful information when looking through global history.

jagerman · 2017-03-22T19:11:23Z

Crap, you're right, I meant to squash that during merge. Can I undo a merge to fix it without messing up too much?

wjakob · 2017-03-22T19:11:24Z

Yeah, that's a lot of commits. @jagerman: do you want to rewrite history and squash this down a bit?

wjakob · 2017-03-22T19:12:09Z

You can just rebase -i on your end and then force-push.

jagerman · 2017-03-22T19:12:21Z

Okay, I'll squash directly on master.

wjakob · 2017-03-22T19:18:04Z

Looks like there is a problem with PyPy again

jagerman · 2017-03-22T19:18:38Z

Rewritten.

jagerman · 2017-03-22T19:21:15Z

PyPy 5.7 was released a few days ago; I suggest we switch the build to use the released version. (Perhaps we can add a 5.8 nightly as a failure-allowed build?).

nbecker · 2017-03-22T19:23:05Z

I'm happy to see speedups, and unchecked access is available. Unfortunately from my perspective, it isn't default and requires extra (code) steps. My typical use would be where indexes are correct by design, and checking access is a waste. It would be nice if I could write:
void F(unchecked_array<int,2> A)
And have all the same methods available on unchecked_array as on the default checked version (docs now mention a subset of methods available)

jagerman · 2017-03-22T19:50:11Z

There are tradeoffs: the biggest issue/objection was that it would be encoding the access mechanism by using a fundamentally different C++ type for safe versus unsafe access arrays, despite the type representing exactly the same underlying object. You'd need yet another C++ type (either under a separate name, or via a template parameter--either way still a distinct type) to distinguish between read-only and mutable input arrays, so now we'd have 3 C++ types to represent exactly the same object--and they'd need implicit conversions defined so that you could switch between them at will (with mutability checks when switching from a read-only to mutable type). There would also be a large amount of method duplication to support it all.

So it's possible, but I'm not really convinced that it's worth the cost just to avoid needing to write auto r = a.unchecked<2>().

jagerman added this to the v2.2 milestone Mar 19, 2017

jagerman mentioned this pull request Mar 19, 2017

Unsafe access #617

Closed

jagerman added 4 commits March 19, 2017 13:12

Make proxy object explicitly non-castable

aeb7f4e

This gives a compile-time failure (rather than a later runtime failure) if attempting to return the proxy object (e.g. from a bound lambda with return type deduction).

array proxy: add tests

3195d8f

Move proxy object to detail namespace

3ce0a55

Fix template disabling of operator[]

09ee40e

jagerman mentioned this pull request Mar 19, 2017

Make a v2.1 release? #726

Closed

wjakob modified the milestones: v2.1, v2.2 Mar 19, 2017

jagerman changed the title ~~WIP: add unchecked array access via proxy object~~ Add unchecked array access via proxy object Mar 20, 2017

Docs: mention writeability, minor rewording

f4c86eb

[ci skip]

dean0x7d reviewed Mar 20, 2017

View reviewed changes

jagerman added 2 commits March 20, 2017 13:09

Remove unnecessary dimension check

761ed3d

add runtime dimension support; add some array-compatible methods

9d8f39e

Add drop-in array/proxy API test

8317fc4

jagerman merged commit cc88278 into pybind:master Mar 22, 2017

dean0x7d mentioned this pull request Jun 7, 2017

please consider adding index operator to numpy.h array_t #621

Closed

jagerman mentioned this pull request Jul 26, 2017

Could unchecked views prevent garbage collection of the underlying arrays? #961

Closed

rwgk mentioned this pull request Feb 9, 2023

FWD pybind11 google/pybind11k#746

Closed

Add unchecked array access via proxy object #746

Add unchecked array access via proxy object #746

Conversation

jagerman commented Mar 19, 2017 • edited Loading

wjakob commented Mar 19, 2017

wjakob commented Mar 19, 2017

jagerman commented Mar 19, 2017

aldanor commented Mar 19, 2017

wjakob commented Mar 19, 2017

SylvainCorlay commented Mar 19, 2017

wjakob commented Mar 19, 2017

jagerman commented Mar 19, 2017

SylvainCorlay commented Mar 20, 2017

SylvainCorlay commented Mar 20, 2017

jagerman commented Mar 20, 2017

SylvainCorlay commented Mar 20, 2017

wjakob commented Mar 20, 2017

jagerman commented Mar 20, 2017

jagerman commented Mar 20, 2017 • edited Loading

dean0x7d left a comment

Choose a reason for hiding this comment

dean0x7d Mar 20, 2017

Choose a reason for hiding this comment

jagerman Mar 20, 2017

Choose a reason for hiding this comment

dean0x7d Mar 20, 2017

Choose a reason for hiding this comment

jagerman Mar 20, 2017

Choose a reason for hiding this comment

dean0x7d Mar 20, 2017

Choose a reason for hiding this comment

jagerman commented Mar 20, 2017

wjakob commented Mar 20, 2017 • edited Loading

wjakob commented Mar 20, 2017

jagerman commented Mar 20, 2017

jagerman commented Mar 20, 2017

wjakob commented Mar 20, 2017

dean0x7d commented Mar 20, 2017

jagerman commented Mar 20, 2017

wjakob commented Mar 22, 2017

jagerman commented Mar 22, 2017

dean0x7d commented Mar 22, 2017

jagerman commented Mar 22, 2017

wjakob commented Mar 22, 2017

wjakob commented Mar 22, 2017

jagerman commented Mar 22, 2017

wjakob commented Mar 22, 2017

jagerman commented Mar 22, 2017

jagerman commented Mar 22, 2017

nbecker commented Mar 22, 2017

jagerman commented Mar 22, 2017

jagerman commented Mar 19, 2017 •

edited

Loading

jagerman commented Mar 20, 2017 •

edited

Loading

wjakob commented Mar 20, 2017 •

edited

Loading